🎉 Jina 1.1

@github-actions github-actions released this 29 Mar 12:55

Jina 1.1

We are excited to release Jina 1.1! Jina is the easier way to do neural search in the cloud.

⬆️ Major Features and Improvements

Permanently delete documents from storage.

When you delete a Document, you’d, er, kinda expect it to disappear, right? Previously in Jina we only marked those Documents as ‘inaccessible’, and you had to completely reindex your entire dataset to actually remove them. From version 1.1 onwards, if you set the delete_on_dump value to true in your YAML files, deleted Documents will be truly deleted when the Flow is restarted. You can find out more about this in our CRUD documentation. Thanks to the community members who requested this feature, and to our engineers who found out that CRUD is actually kinda harder to build than expected. (Related PRs: #2046, #2102, #2150, #2144)
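To make the difference between marking and truly deleting concrete, here is a minimal sketch in plain Python. It is not Jina's implementation: the `TombstoneIndex` class and its methods are hypothetical names invented for illustration, and only the `delete_on_dump` flag name comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class TombstoneIndex:
    """Toy key-value index (hypothetical, not Jina's actual indexer)."""

    delete_on_dump: bool = False
    _store: Dict[str, str] = field(default_factory=dict, init=False)
    _deleted: Set[str] = field(default_factory=set, init=False)

    def add(self, doc_id: str, content: str) -> None:
        self._store[doc_id] = content
        self._deleted.discard(doc_id)

    def delete(self, doc_id: str) -> None:
        # Soft delete: the entry is only marked as inaccessible.
        if doc_id in self._store:
            self._deleted.add(doc_id)

    def query(self, doc_id: str) -> Optional[str]:
        # Marked entries are hidden from queries but still stored.
        if doc_id in self._deleted:
            return None
        return self._store.get(doc_id)

    def dump(self) -> Dict[str, str]:
        """Return what would be written to storage on shutdown."""
        if self.delete_on_dump:
            # Hard delete: drop every marked entry before writing.
            for doc_id in self._deleted:
                self._store.pop(doc_id, None)
            self._deleted.clear()
        return dict(self._store)


index = TombstoneIndex(delete_on_dump=True)
index.add("a", "black shoes")
index.add("b", "red shoes")
index.delete("a")
print(index.query("a"))  # the soft-deleted entry is already hidden from queries
print(index.dump())      # with delete_on_dump=True it is also absent from the dump
```

With `delete_on_dump=False`, the same sequence would still hide `"a"` from queries but keep it in the dumped data, which is the pre-1.1 behavior described above.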

Distributed Remote Peas in a single Pod.

As users of Jina will know, it is possible to have multiple Peas running inside a single Pod. As of 1.1, it is now possible to distribute these Peas across remote locations. Using JinaD, you could have two Peas running on one machine, and a third Pea running on a remote machine. The machine could be located on a sunny beach in the Mediterranean, drinking cocktails. Sorry, lost my train of thought. Anyway, load balancing will always be handled automatically by the Headpea! For information on how to implement this feature, check out Development Guide: Peas and Pods in Jina. (Related PRs: #2143, #2241)

Support for learning-to-rank models

Say you’re running an e-commerce site: your user searches for black shoes, and Jina finds one hundred black shoes in your database. For users who don’t have two hundred pairs of feet, how best do you display these one hundred results?

It’s common in the industry to use learning algorithms to rank these results based on users' clicking behavior. You can now use your LightGBMRanker decision tree in Jina! To find out how, check out the Jina Hub documentation. (Related PRs: #1953)
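As a rough illustration of ranking by click behavior, the sketch below reorders results by a smoothed click-through rate. This is a deliberately simple stand-in, not a learned model and not how LightGBMRanker works; the function name `rerank_by_clicks`, its parameters, and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple


def rerank_by_clicks(
    results: List[str],
    stats: Dict[str, Tuple[int, int]],
    prior_clicks: float = 1.0,
    prior_views: float = 10.0,
) -> List[str]:
    """Order result ids by smoothed click-through rate, highest first.

    `stats` maps a result id to (clicks, views). The priors keep results
    with little or no traffic from being scored as 0 or 1 outright.
    """

    def score(doc_id: str) -> float:
        clicks, views = stats.get(doc_id, (0, 0))
        return (clicks + prior_clicks) / (views + prior_views)

    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(results, key=score, reverse=True)


results = ["shoe-1", "shoe-2", "shoe-3"]
stats = {"shoe-1": (1, 100), "shoe-2": (30, 100)}  # shoe-3 has no traffic yet
ranked = rerank_by_clicks(results, stats)
print(ranked)
```

A learning-to-rank model replaces the hand-written `score` function with one trained on many features of the query and the result, but the final step is the same: sort the candidates by predicted relevance.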

Support YAML IntelliSense to make writing YAML configurations easier.

Life is hard enough these days, but we at Jina are trying to make it as easy for you as possible and hopefully bring some joy into your life.
This month we made it even easier for you to write YAML files to create Jina Flows. We now provide a JSON Schema for your IDE to enable code completion, syntax validation, members listing, and displaying help text. Here is a video tutorial to walk you through the setup. If you need more happiness than this, we suggest going for a walk in the sun 🌞 (Related PRs: #2059, #2066)

Better performance when applying functions recursively on Documents.

The ability to recursively break down Documents is a core feature of Jina. If you have photos containing families of cats, but want to search for individual cats, you can segment these images into chunks, increasing the accuracy of your system and making cat lovers around the world happy.

🏎️ These operations are now faster. Our team moved the traversing logic into the primitive types and adapted the batching logic accordingly. (Related PRs: #1950, #2145, #2196)
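The idea of putting traversal on the type itself and batching the visited nodes can be sketched as follows. This is an assumption-laden toy, not Jina's Document API: the `Doc` class, its `traverse` method, and `apply_in_batches` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Doc:
    """Toy nested document (hypothetical, not jina.Document)."""

    text: str
    chunks: List["Doc"] = field(default_factory=list)

    def traverse(self, depth: int) -> Iterator["Doc"]:
        """Yield every node exactly `depth` levels below this one."""
        if depth == 0:
            yield self
            return
        for chunk in self.chunks:
            yield from chunk.traverse(depth - 1)


def apply_in_batches(
    root: Doc, depth: int, fn: Callable[[List[Doc]], None], batch_size: int
) -> None:
    """Call `fn` on batches of nodes found at the given depth."""
    batch: List[Doc] = []
    for node in root.traverse(depth):
        batch.append(node)
        if len(batch) == batch_size:
            fn(batch)
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        fn(batch)


photo = Doc("family photo", [Doc("cat 1"), Doc("cat 2"), Doc("cat 3")])
seen: List[List[str]] = []
apply_in_batches(photo, depth=1, fn=lambda b: seen.append([d.text for d in b]), batch_size=2)
print(seen)
```

Because the type owns the traversal, callers only choose a depth and a batch size; they never walk the tree by hand.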

⚠️ Breaking Changes

  • Remove the CLI option --optimize-level #1975
  • Remove unused pipeline encoder #2033
  • Replace is_merge argument with is_update in KVSearchDriver. #1296
  • Add update method to Document for updating the object. #1296
  • Refactor CLI for running jina hello-world. jina hello-world is deprecated. jina hello mnist, jina hello chat-bot are added. #1985
  • Remove PipeLogger #2003
  • Add keyword jtype to represent YAML tag #2044, #2061
  • Replace DockerKwargsAppendAction with KVAppendAction in CLI #2050
  • Rename --silent-remote-logs to --quiet-remote-logs. #2122

📗 Documentation

🐞 Bug Fixes and Other Changes

Flow

  • Improve the typing in sugary_io.py. #2049
  • Add support to specify hosts for peas #2143 #2241

Executors

  • Replace field argument in DocCache with fields to support caching with multiple fields as key. #1970, #2032
  • Fix the issue of closing loggers #2019
  • Refactor input_fn to inputs. #2054
  • Enable the usage of dynamic workspace for indexers. #2114
  • Fix the typos in the column names in BaseRanker #1973
  • Refactor fill-in drivers logic. #2134
  • Avoid dumping drivers when storing executors. #2133
  • Fix a bug in passing metavar #2155

Drivers

  • Add fields argument to RankEvaluateDriver so that one can pass multiple fields to rankers. #1953
  • Add support in Drivers for match_required_keys and query_required_keys in Executors. #1947
  • Add fields argument to BasePredictDriver to enable the usage of either embedding or content of Documents when doing classification #1957, #1995
  • Remove the inheritance of RecursiveMixin in RecursiveMixin #1980

Types

  • Add convert_buffer_image_to_blob, convert_uri_to_blob and convert_data_uri_to_blob to Document for converting data to blob. #1929, #1930, #1982
  • Refactor warnings when drivers failed to read Document into Document #1930
  • Move queryset from jina/drivers to jina/types #1933
  • Remove the legacy traverse function from Document #2017
  • Keep the mime type of the matches the same as the reference Document instead of the query Document #2025
  • Enable Drivers to extract arbitrary attributes from Document #2110, #2192

Tests

  • Refactor tests/unit/drivers/querylang/test_querylang_drivers.py to use types. #1963
  • Add unit tests for JsonFormatter and ProfileFormatter #1983
  • Add unit tests for request_generator in jina.clients.request.asyncio #1992
  • Add more unit tests for Flow #1991
  • Add unit tests for profiling #2016
  • Fix doc generator in the unit tests of Flow #2008
  • Add more unit tests for Document #2012
  • Add more unit tests for hubio #1972
  • Add more unit tests for types/score #2029
  • Add unit tests for ImportChecker #2042
  • Add tests for BaseMindsporeEncoder. #2058
  • Add integration tests for RESTful APIs. #2104
  • Refactor unit tests for RankDriver #2135
  • Remove folder after unit tests for CRUD. #2162
  • Add unit tests for Pods. #2164
  • Add integration tests for jina hello. #2218

HubIO

  • Add support for selecting a Dockerfile #2056
  • Add support for multistage Dockerfiles #2037

Others

  • Add pre-commit hooks for docstrings #1925, #1962
  • Add logging to the HubIO #1923
  • Add use_default keyword under requests to executors YAML configs. When use_default: true, we explicitly use the default setting under jina/resources; Add with keyword under each request type to avoid repeating configuration for different drivers. #1959, #1967, #2053
  • Fix CI to exit elegantly when a failure occurs #1990
  • Add codecov for daemon #1993
  • Add darglint to check docstring during CI #2007, #2035, #2048, #2045, #2199
  • Add jina hello fashion as a multimodal search example #2002
  • Fix the RESTful API documentation in openapi #1994
  • Add JINA_LOG_CONFIG environment variable and add quiet argument for JinaLogger. When setting JINA_LOG_CONFIG=QUIET, quiet mode is used for logging. #2004
  • Replace --hide-exc-info with --quiet-error in CLI; add --description to show descriptions of the object in CLI #2005
  • Refactor the hello-world file structure #2010
  • Remove unused code #2041
  • Fix representation for enum when dumping #2047
  • Add shell checker for colab #2052
  • Add deployment for cloud.jina.ai in automated release. #2099
  • Add context manager for BaseFlow to solve the broken CI #2100
  • Add black to make the code styling consistent #2036, #2117, #2211
  • Enable pydantic-based json schema for REST APIs. #2121, #2146
  • Refactor the typing in jina/clients. #2014
  • Remove the deprecated NTLogger. #2138
  • Refactor the NotImplementedError usage. #2147
  • Improve the package importing error processing using the context manager ImportExtensions. #2204, #2237
  • Refactor the functions in Zmqlet #2208
  • Add PngToDiskDriver for debugging purposes #2209
  • Add unit tests for ZMQManyRuntime #2230

🙏 Thanks to our Contributors

This release contains contributions from @cristianmtr @hanxiao @JoanFM @florian-hoenicke @nan-wang @Yongxuanzhang @bwanglzu @alexcg1 @deepampatel @LukeekuL @deepankarm @davidbp @FionnD @theUnkownName @atibaup

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.