🎉 Release v0.9.2

@github-actions github-actions released this 05 Jan 20:14

We are excited to release Jina 0.9.2. Jina is the easier way to do neural search in the cloud. Highlights of this release include:

  • Support for delete/update operations
  • Add native AsyncIO support and unlock native support for running Jina in Jupyter notebooks and IPython
  • Add MultimodalDocument as a primitive type to support multimodal search in a Pythonic way
  • Refactor Pea and introduce Runtime to improve code readability and maintainability

Release 0.9.2

⬆️ Major Features and Improvements

Completeness

Example, using top-level await as supported in Jupyter notebooks and IPython:

from jina import AsyncFlow
with AsyncFlow().add(uses='_logforward') as f:
    await f.index_lines(lines=['hello', 'jina'], on_done=print)
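
Top-level `await` only works where an event loop is already running, such as Jupyter notebooks and IPython. In a plain Python script the same pattern is usually wrapped in a coroutine and driven with `asyncio.run`. The sketch below shows that wrapping with a stand-in coroutine. `index_lines_stub` is a hypothetical name and does not call Jina.

```python
import asyncio


async def index_lines_stub(lines, on_done):
    """Hypothetical stand-in for an async indexing call; it does not use Jina."""
    for line in lines:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        on_done(line)
    return len(lines)


async def main():
    received = []
    count = await index_lines_stub(['hello', 'jina'], on_done=received.append)
    return count, received


# A script has no running event loop, so one is started explicitly.
count, received = asyncio.run(main())
print(count, received)
```

Inside a notebook, the body of `main` could be awaited directly instead of going through `asyncio.run`.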

Ease of Use

Example code:
import numpy as np
from jina import Document, MultimodalDocument

# Each chunk carries one modality and its own embedding
chunk_img = Document(modality='dummy_image', embedding=np.random.rand(1, 4))
chunk_text = Document(modality='dummy_text', embedding=np.random.rand(1, 10))
multimodal_doc = MultimodalDocument(chunks=[chunk_img, chunk_text])
  • Introduce Runtime as a member of Pea, defined as "a procedure that blocks the main process once running, therefore must be put into a separated thread/process". The new architecture greatly improves the readability and maintainability of the code. #1426, #1473, #1487, #1539, #1577
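
The Runtime definition quoted above can be illustrated with plain `threading`: a blocking loop is moved into its own thread so the caller stays responsive, and events signal readiness and shutdown. This is a minimal sketch of the idea only. `BlockingRuntimeSketch` and `PeaSketch` are hypothetical names, not Jina's actual classes.

```python
import threading


class BlockingRuntimeSketch:
    """Hypothetical runtime: run_forever() blocks until cancel() is called."""

    def __init__(self):
        self.is_ready = threading.Event()
        self._stop_requested = threading.Event()

    def run_forever(self):
        self.is_ready.set()          # signal that the blocking loop has started
        self._stop_requested.wait()  # block here until asked to stop

    def cancel(self):
        self._stop_requested.set()


class PeaSketch:
    """Hypothetical owner that keeps the blocking runtime off the main thread."""

    def __init__(self):
        self.runtime = BlockingRuntimeSketch()
        self._thread = threading.Thread(target=self.runtime.run_forever, daemon=True)

    def __enter__(self):
        self._thread.start()
        self.runtime.is_ready.wait(timeout=5)
        return self

    def __exit__(self, *exc_info):
        self.runtime.cancel()
        self._thread.join(timeout=5)

    def is_alive(self):
        return self._thread.is_alive()


with PeaSketch() as pea:
    alive_inside = pea.is_alive()  # the main thread is free while the runtime blocks
alive_after = pea.is_alive()
print(alive_inside, alive_after)
```

Swapping `threading.Thread` for `multiprocessing.Process` would give the process-based variant of the same pattern.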

⚠️ Breaking Changes

  • Introduce UniqueId, ChunkSet, DocumentSet, MatchSet; Remove add_chunk and add_match; Refactor Document with newly introduced classes. #1343
Example code:

0.8.0:
from jina import Document
with Document() as d:
    c = Document(id='1:0>16')
    d.chunks.append(c)

with Document() as d:
    c = d.chunks.append()
    c.id = '1:0>16'

0.9.2:
from jina import Document
with Document() as d:
    c = d.chunks.add_chunks()
    c.id = '1:0>16'
  • Refactor YAML file parsing backend from ruamel.yaml to pyyaml and introduce jina.jaml for parsing YAML files. The dependency on ruamel.yaml is deprecated. #1495, #1516, #1524, #1533, #1547, #1581

  • Add _merge_matches and _merge_chunks for merging messages in different ways. Remove _merge_all. #1406, #1418

  • Rename PyClient to Client for simplicity. #1450

📗 Documentation

🐞 Bug Fixes and Other Changes

Flow

  • Fix issue terminating RemotePea #133
  • Refactor Pea closing logic #1379, #1398, #1457
  • Refactor peapods code base #1421
  • Add versioning for Flow YAML config files. Introduce method field for Flow YAML configurations. #1442
  • Add env field for Flow and Pod YAML configuration so that shared environment variables can be set. #1446, #1448
  • Rename Flow output argument to on_done. #1476
  • Fix client top_k malfunctioning bug. #1522
  • Add return_list option for Flow API and introduce Response as a new primitive type. When return_list=True, results are returned as a list of Response objects, which makes them easier to interpret. #1541
  • Fix CORS behavior bug for REST API #1568 @yk
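
The Flow items above describe two ways of consuming results: a callback passed as on_done, and a collected list when return_list=True. The difference can be sketched without Jina as follows. `run_requests_sketch` and the dictionary standing in for a Response object are hypothetical and only illustrate the two styles.

```python
def run_requests_sketch(requests, on_done=None, return_list=False):
    """Hypothetical request loop; not Jina's implementation."""
    responses = []
    for request in requests:
        response = {'request': request, 'status': 'ok'}  # stand-in for a Response
        if on_done is not None:
            on_done(response)           # callback style: handle each result as it arrives
        if return_list:
            responses.append(response)  # list style: keep results for the caller
    return responses if return_list else None


seen = []
nothing = run_requests_sketch(['q1', 'q2'], on_done=seen.append)
collected = run_requests_sketch(['q1', 'q2'], return_list=True)
print(nothing, len(seen), collected)
```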

Executors

  • Change default metric of NumpyIndexer to cosine #1393
  • Remove deprecated jina/executors/encoders/helper.py #1563 @tadejsv
  • Introduce batching_multi_input decorator to add batching support for rankers #1467 @deepampatel
  • Allow Indexers to have separate workspaces. #1383
  • Fix bug when shards are empty #1340, #1396

Drivers

  • Add op_name for Matches2DocRankDriver #1409
  • Add batch_size argument for EncodeDriver to enable batching on driver level #1483
  • Make DocIdCache capable of detecting collisions on content level #1510
  • Enable AggregateMatches2DocRankDriver for keeping chunks of matches #1494
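
Driver-level batching, as enabled by the batch_size argument for EncodeDriver, amounts to splitting the incoming documents into fixed-size groups before handing them to the encoder. The sketch below shows that splitting in plain Python. `batch_iter` and `encode_stub` are hypothetical helpers, not part of Jina.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')


def batch_iter(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError('batch_size must be a positive integer')
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # the last batch may be smaller than batch_size
        yield batch


def encode_stub(texts):
    """Hypothetical encoder: maps each text to its length."""
    return [len(text) for text in texts]


docs = ['a', 'bb', 'ccc', 'dddd', 'eeeee']
batches = list(batch_iter(docs, batch_size=2))
embeddings = [value for batch in batches for value in encode_stub(batch)]
print(batches, embeddings)
```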

Types

  • Add NamedScore as new primitive type. #1430
  • Support + and += operations for Document. #1555
  • Move extract_content() to DocumentSet. Instead of calling extract_content(docs) on docs = DocumentSet(random_docs(2)), call docs.all_contents() to get the contents of a set of Documents. #1387
  • Refactor random_id and introduce content_hash field in Document. #1440
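
The content_hash field and the content-level collision detection in DocIdCache both rest on the same idea: hash the content of a document rather than its id, so that identical content is recognised even under a different id. A minimal sketch, assuming a BLAKE2b digest as the hash (the function Jina actually uses is not stated here), with the hypothetical names `content_hash_sketch` and `ContentCacheSketch`:

```python
import hashlib


def content_hash_sketch(content: bytes) -> str:
    """Hypothetical content hash; BLAKE2b is an assumption, not Jina's choice."""
    return hashlib.blake2b(content, digest_size=8).hexdigest()


class ContentCacheSketch:
    """Hypothetical cache that rejects documents whose content was already seen."""

    def __init__(self):
        self._seen = {}  # content hash -> id of the first document with that content

    def add(self, doc_id: str, content: bytes) -> bool:
        """Return True if the content is new, False if it collides with a cached one."""
        key = content_hash_sketch(content)
        if key in self._seen:
            return False
        self._seen[key] = doc_id
        return True


cache = ContentCacheSketch()
first = cache.add('1', b'hello')
duplicate = cache.add('2', b'hello')  # different id, same content
other = cache.add('3', b'jina')
print(first, duplicate, other)
```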

Tests

  • Improve unit tests for test_hello_world #1305
  • Refactor unit tests for queryset #1336
  • Refactor unit tests for evaluation #1339
  • Refactor unit tests for index remote #1346
  • Fix integration tests for jinad #1367, #1388, #1407
  • Refactor random_docs() in unit tests #1356
  • Add unit tests for convert functions in Document #1389
  • Fix callbacks in unit tests. Callback failures could previously go uncaptured by the tests. #1391
  • Fix integration tests for evaluation #1411
  • Refactor docstrings in unit tests of QueryLangSet #1417
  • Fix bug failing to capture errors of callbacks during unit tests. #1419, #1536
  • Refactor unit tests for types #1435
  • Refactor unit tests for request #1445
  • Add unit tests for corner cases in calculating similarity metrics #1434
  • Add evaluation option for hello-world #1465, #1488, #1508, #1501
  • Add test for loading customized drivers #1474
  • Refactor unit test for drivers #1452
  • Set default value of eval_at in PrecisionEvaluator and RecallEvaluator to None #1552
  • Fix unit tests of test_hub_usage when GITHUB_TOKEN is used. #1560
  • Refactor unit tests for drivers #1559
  • Refactor unit tests in hubio to use BuildTestLevel #1361
  • Fix naming for test_rankingevaluation_driver #1573

HubIO

  • Fix Jina Hub automated updates and add GA for updating Jina Hub images. Check out more details at hub-updater #1298, #1345, #1360, #1456
  • Redefine naming convention of Docker images in Jina Hub. Naming follows {repository}/{type}.{kind}.{name}:{version}-{jina_version} #1341
  • Avoid overwriting Docker image in Jina Hub when tag already exists. #1365
  • Clean up hubio imports. #1381
  • Fix hubio version checking and add --no-overwrite option for jina hub --push #1403
  • Fix hubio test levels #1361
  • Add --timeout-ready option for hubio #1525
  • Fix typo in error message #1531
  • Fix access to token credential file for jina hub push #1492
  • Switch to hubapi for retrieving Docker login information #1429, #1589
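
The naming convention {repository}/{type}.{kind}.{name}:{version}-{jina_version} listed above can be composed and taken apart mechanically. The helpers below are a hypothetical sketch, not part of hubio, and the example values are made up for illustration.

```python
import re

# Mirrors {repository}/{type}.{kind}.{name}:{version}-{jina_version}
_IMAGE_NAME = re.compile(
    r'(?P<repository>[^/]+)/(?P<type>[^.]+)\.(?P<kind>[^.]+)\.(?P<name>[^:]+)'
    r':(?P<version>.+)-(?P<jina_version>[^-]+)'
)


def hub_image_name(repository, type_, kind, name, version, jina_version):
    """Build an image name following the convention (hypothetical helper)."""
    return f'{repository}/{type_}.{kind}.{name}:{version}-{jina_version}'


def parse_hub_image_name(image):
    """Split an image name back into its parts, or raise ValueError."""
    match = _IMAGE_NAME.fullmatch(image)
    if match is None:
        raise ValueError(f'not a valid image name: {image!r}')
    return match.groupdict()


# Illustrative values only
image = hub_image_name('jinahub', 'pod', 'encoder', 'exampleencoder', '0.0.1', '0.9.2')
parts = parse_hub_image_name(image)
print(image, parts)
```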

Others

  • Adapt to new remote log APIs #1300
  • Adapt to Docker SDK 4.4.0 in ContainerPea #1334
  • Move log parser from jinad to core. #1342
  • Use load_config directly as a classmethod #1352, #1354
  • Fix bug during completing file path for errors #1353
  • Fix top-k setting bug #1359
  • Fix newlines for autocompletion in bash. #1425 @lsgrep
  • Fix latency check during CI #1437
  • Add client-side exception handlers #1458, #1462
  • Add GA for automated comments on lint failures. #1486, #1507, #1519
  • Introduce ArgNamespace in jina.helper to manage all namespace-related operations #1489
  • Introduce training. #1518
  • Introduce jina.jaml for parsing YAML files. #1533, #1547, #1581
  • Fix bug in parsing config source files #1583

🙏 Thanks to our Contributors

This release contains contributions from Amritpal Singh, Bithiah Yuan, CatStark, Deepam Patel, Deepankar Mahapatro, Florian Hönicke, Han Xiao, Harry Stark, Hidan, Joan Fontanals, Nan Wang, Pratik Bhavsar, Rutuja Surve, Sergey M, Siyuan Shi, Szymon Skorupinski, Tadej Svetina, Wang Bo, Yannic Kilcher, Yusup, cristian, florian-hoenicke

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.