We are excited to release Jina 0.9.2. Jina is the easier way to do neural search in the cloud. Highlights of this release include:

Support for delete/update operations
Add native AsyncIO support and unlock native support for running Jina in Jupyter notebooks and IPython
Add MultimodalDocument as a primitive type to support multimodal search in a Pythonic way
Refactor Pea and introduce Runtime to improve code readability and maintainability

Release 0.9.2

⬆️ Major Features and Improvements

Completeness

To support update and delete operations, we introduced update and delete methods for BaseKVIndexer and BaseVectorIndexer, and added update and delete APIs to the Flow APIs. #1380, #1415, #1550, #1460
Refactor to native asyncio. This unlocks support for running Jina in Jupyter notebooks and IPython. AsyncClient and AsyncFlow were added to let users manage the event loop themselves and to make Jina more reliable in Jupyter notebooks and IPython. #1348, #1408, #1410, #1428, #1450, #1453, #1463, #1562

Click to see example

from jina import AsyncFlow
with AsyncFlow().add(uses='_logforward') as f:
    await f.index_lines(lines=['hello', 'jina'], on_done=print)
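To make the update and delete operations from this release more concrete, here is a minimal, self-contained sketch of a key-value index that supports add, update, and delete. The class `ToyKVIndexer`, its method signatures, and the choice to ignore unknown keys on update are assumptions made for illustration; this is not Jina's `BaseKVIndexer` API.

```python
from typing import Dict, Iterable, Optional


class ToyKVIndexer:
    """Hypothetical in-memory key-value index; not Jina's BaseKVIndexer."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def add(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        # Insert (or overwrite) each key with its value.
        for key, value in zip(keys, values):
            self._store[key] = value

    def update(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        # Only overwrite keys that already exist; unknown keys are ignored.
        for key, value in zip(keys, values):
            if key in self._store:
                self._store[key] = value

    def delete(self, keys: Iterable[str]) -> None:
        # Deleting a missing key is a no-op.
        for key in keys:
            self._store.pop(key, None)

    def query(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


indexer = ToyKVIndexer()
indexer.add(['doc1', 'doc2'], [b'hello', b'jina'])
indexer.update(['doc1', 'doc3'], [b'hi', b'ignored'])
indexer.delete(['doc2'])
```

In this sketch, `doc1` ends up with the updated value, `doc2` is gone, and `doc3` was never created because update does not insert new keys.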

Ease of Use

Add MultimodalDocument as a primitive type. This lets users build a multimodal search system in a Pythonic way. #1335, #1385, #1390, #1368, #1395, #1399, #1401

Click here for example code

import numpy as np
from jina import Document, MultimodalDocument
chunk_img = Document(modality='dummy_image', embedding=np.random.rand(1, 4))
chunk_text = Document(modality='dummy_text', embedding=np.random.rand(1, 10))
multimodal_doc = MultimodalDocument(chunks=[chunk_img, chunk_text])

Introduce Runtime as a member of Pea, defined as "a procedure that blocks the main process once running, therefore must be put into a separated thread/process". The new architecture greatly improves the readability and maintainability of the code. #1426, #1473, #1487, #1539, #1577
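The Runtime definition above can be illustrated with a small standard-library sketch: a blocking loop that a wrapper object runs on a separate thread so the main thread stays free. `ToyRuntime` and `ToyPea` are hypothetical names invented for this example, and the sketch makes no claim about how Jina's Pea and Runtime are actually implemented.

```python
import threading
import time


class ToyRuntime:
    """Hypothetical blocking procedure; not Jina's Runtime class."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self.ticks = 0

    def run_forever(self) -> None:
        # Blocks the calling thread until cancel() is called.
        while True:
            self.ticks += 1
            if self._stop.wait(0.01):
                break

    def cancel(self) -> None:
        self._stop.set()


class ToyPea:
    """Hypothetical wrapper that keeps the blocking runtime off the main thread."""

    def __init__(self, runtime: ToyRuntime) -> None:
        self.runtime = runtime
        self._worker = threading.Thread(target=runtime.run_forever, daemon=True)

    def __enter__(self) -> 'ToyPea':
        self._worker.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Ask the runtime to stop, then wait for the worker thread to finish.
        self.runtime.cancel()
        self._worker.join()


runtime = ToyRuntime()
with ToyPea(runtime) as pea:
    time.sleep(0.05)  # the main thread stays free while the runtime loops
worker_alive_after_exit = pea._worker.is_alive()
```

Separating the blocking loop from the object that starts and stops it keeps the lifecycle logic (start, cancel, join) in one place, which is the kind of readability gain the refactoring describes.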

⚠️ Breaking Changes

Introduce UniqueId, ChunkSet, DocumentSet, MatchSet; Remove add_chunk and add_match; Refactor Document with newly introduced classes. #1343

Click here for example code

0.8.0

0.9.2

from jina import Document
with Document() as d:
    c = Document(id=f'1:0>16')
    d.chunks.append(c)

with Document() as d:
    c = d.chunks.append()
    c.id = f'1:0>16'

from jina import Document
with Document() as d:
    c = d.chunks.add_chunks()
    c.id = f'1:0>16'

Refactor YAML file parsing backend from ruamel.yaml to pyyaml and introduce jina.jaml for parsing YAML files. The dependency on ruamel.yaml is deprecated. #1495, #1516, #1524, #1533, #1547, #1581
Add _merge_matches and _merge_chunks for merging messages in different ways. Remove _merge_all. #1406, #1418
PyClient renamed to Client for simplicity #1450

📗 Documentation

Update Korean Readme #1364 @doomdabo
Add code review guide #1397
Fix typos in helloworld.html. #1405 @harry-stark
Add documentation for recursive data structure. #1394
Fix redundant translation in Chinese Readme #1443 @smy0428
Fix missing CLI content. #1481
Fix typos in README.md. #1500 @Kavan72
Improve docs.jina.ai. #1513, #1514, #1586
Improve TorchDevice docstring #1499 @tadej-redstone
Fix typos in Russian Readme #1544, #1572 @git-webmaster
Fix typos in CLI interface #1578 @xinbinhuang
Add Spanish Readme #1579 @PabloRN

🐞 Bug Fixes and Other Changes

Flow

Fix issue terminating RemotePea #133
Refactor Pea closing logic #1379, #1398, #1457
Refactor peapods code base #1421
Add versioning for Flow YAML config files. Introduce method field for Flow YAML configurations. #1442
Add env field for Flow and Pod YAML configuration so that shared environment variables can be set. #1446, #1448
Rename Flow output argument to on_done. #1476
Fix client top_k malfunctioning bug. #1522
Add return_list option for Flow API and introduce Response as new primitive type. When return_list=True, results are returned as a list of Response objects, which makes them easier to inspect. #1541
Fix CORS behavior bug for REST API #1568 @yk

Executors

Change default metric of NumpyIndexer to cosine #1393
Remove deprecated jina/executors/encoders/helper.py #1563 @tadejsv
Introduce batching_multi_input decorator to add batching support for rankers #1467 @deepampatel
Allow Indexers to have separate workspaces. #1383
Fix bug when shards are empty #1340, #1396
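The general idea behind a batching decorator such as `batching_multi_input` can be sketched as follows: the wrapped function receives several parallel inputs, and the decorator slices all of them with the same window before each call. The decorator name `batched`, its `batch_size` parameter, and the `score` function are hypothetical; this is an illustration of the technique, not Jina's implementation or signature.

```python
from functools import wraps
from typing import Callable, List, Sequence


def batched(batch_size: int) -> Callable:
    """Hypothetical decorator: call the wrapped function on slices of its inputs."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*inputs: Sequence) -> List:
            results: List = []
            total = len(inputs[0])
            for start in range(0, total, batch_size):
                # Slice every positional input with the same window.
                chunk = [seq[start:start + batch_size] for seq in inputs]
                results.extend(func(*chunk))
            return results

        return wrapper

    return decorator


call_sizes = []  # records how many items each call received


@batched(batch_size=2)
def score(queries, matches):
    call_sizes.append(len(queries))
    return [len(q) + len(m) for q, m in zip(queries, matches)]


scores = score(['a', 'bb', 'ccc'], ['x', 'yy', 'zzz'])
```

With three items and a batch size of two, `score` is called twice (two items, then one), and the per-batch results are concatenated in order.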

Drivers

Add op_name for Matches2DocRankDriver #1409
Add batch_size argument for EncodeDriver to enable batching on driver level #1483
Make DocIdCache capable of detecting collisions on content level #1510
Enable AggregateMatches2DocRankDriver for keeping chunks of matches #1494

Types

Add NamedScore as new primitive type. #1430
Support + and += operations for Document. #1555
Move extract_content() to DocumentSet. Instead of docs = DocumentSet(random_docs(2)); extract_content(docs), you can now call docs.all_contents() to get the contents of a set of Documents. #1387
Refactor random_id and introduce content_hash field in Document. #1440
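One plausible use of a content hash is spotting documents with identical content regardless of their IDs. The sketch below shows that idea with the standard library only. The helper `content_hash`, the class `ContentCache`, and the choice of BLAKE2b with an 8-byte digest are assumptions for illustration; Jina's actual hashing scheme may differ.

```python
import hashlib
from typing import Set


def content_hash(content: bytes) -> str:
    """Hypothetical helper: hex digest of the raw content (Jina's hashing may differ)."""
    return hashlib.blake2b(content, digest_size=8).hexdigest()


class ContentCache:
    """Hypothetical cache that remembers which content digests were already seen."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def is_duplicate(self, content: bytes) -> bool:
        digest = content_hash(content)
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False


cache = ContentCache()
flags = [cache.is_duplicate(c) for c in (b'hello', b'jina', b'hello')]
```

Here the third item repeats the first one's content, so only the last flag is true.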

Tests

Improve unit tests for test_hello_world #1305
Refactor unit tests for queryset #1336
Refactor unit tests for evaluation #1339
Refactor unit tests for index remote #1346
Fix integration tests for jinad #1367, #1388, #1407
Refactor random_docs() in unit tests #1356
Add unit tests for convert functions in Document #1389
Fix callbacks in unit tests. Callback failures could previously go uncaptured by tests. #1391
Fix integration tests for evaluation #1411
Refactor docstrings in unit tests of QueryLangSet #1417
Fix bug failing to capture errors of callbacks during unit tests. #1419, #1536
Refactor unit tests for types #1435
Refactor unit tests for request #1445
Add unit tests for corner cases in calculating similarity metrics #1434
Add evaluation option for hello-world #1465, #1488, #1508, #1501
Add test for loading customized drivers #1474
Refactor unit test for drivers #1452
Set default value of eval_at in PrecisionEvaluator and RecallEvaluator to None #1552
Fix unit tests of test_hub_usage when GITHUB_TOKEN is used. #1560
Refactor unit tests for drivers #1559
Refactor unit tests in hubio to use BuildTestLevel #1361
Fix naming for test_rankingevaluation_driver #1573

HubIO

Fix Jina Hub automated updates and add GA for updating Jina Hub images. Check out more details at hub-updater #1298, #1345, #1360, #1456
Redefine naming convention of Docker images in Jina Hub. Naming follows {repository}/{type}.{kind}.{name}:{version}-{jina_version} #1341
Avoid overwriting Docker image in Jina Hub when tag already exists. #1365
Clean up hubio imports. #1381
Fix hubio version checking and add --no-overwrite option for jina hub --push #1403
Fix hubio test levels #1361
Add --timeout-ready option for hubio #1525
Fix typo in error message #1531
Fix access to token credential file for jina hub push #1492
Switch to hubapi for retrieving Docker login information #1429, #1589
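The image naming convention `{repository}/{type}.{kind}.{name}:{version}-{jina_version}` can be sketched as a pair of helpers that build and parse such names. The function names, the regular expression, and the example values (`jinahub`, `pod`, `encoder`, `myencoder`) are hypothetical and only illustrate the pattern; they are not taken from Jina Hub's code.

```python
import re
from typing import Dict

# One named group per field in the convention.
IMAGE_NAME_PATTERN = re.compile(
    r'^(?P<repository>[^/]+)/(?P<type>[^.]+)\.(?P<kind>[^.]+)\.(?P<name>[^:]+)'
    r':(?P<version>[^-]+)-(?P<jina_version>.+)$'
)


def hub_image_name(repository: str, type_: str, kind: str, name: str,
                   version: str, jina_version: str) -> str:
    """Hypothetical helper that formats an image name following the convention."""
    return f'{repository}/{type_}.{kind}.{name}:{version}-{jina_version}'


def parse_hub_image_name(image: str) -> Dict[str, str]:
    """Hypothetical helper that splits an image name back into its fields."""
    match = IMAGE_NAME_PATTERN.match(image)
    if match is None:
        raise ValueError(f'not a valid image name: {image!r}')
    return match.groupdict()


example = hub_image_name('jinahub', 'pod', 'encoder', 'myencoder', '0.0.1', '0.9.2')
fields = parse_hub_image_name(example)
```

Including the Jina version in the tag lets the same executor version be published separately for each Jina release it was built against.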

Others

Adapt to new remote log APIs #1300
Adapt to Docker SDK 4.4.0 in ContainerPea #1334
Move log parser from jinad to core. #1342
Use load_config directly as a classmethod #1352, #1354
Fix bug during completing file path for errors #1353
Fix top-k setting bug #1359
Fix newlines for autocompletion in bash. #1425 @lsgrep
Fix latency check during CI #1437
Add client-side exception handlers #1458, #1462
Add GA for automated comments on lint failures. #1486, #1507, #1519
Introduce ArgNamespace in jina.helper to manage all namespace-related operations #1489
Introduce training. #1518
Introduce jina.jaml for parsing YAML files. #1533, #1547, #1581
Fix bug in parsing config source files #1583

🙏 Thanks to our Contributors

This release contains contributions from Amritpal Singh, Bithiah Yuan, CatStark, Deepam Patel, Deepankar Mahapatro, Florian Hönicke, Han Xiao, Harry Stark, Hidan, Joan Fontanals, Nan Wang, Pratik Bhavsar, Rutuja Surve, Sergey M, Siyuan Shi, Szymon Skorupinski, Tadej Svetina, Wang Bo, Yannic Kilcher, Yusup, cristian, florian-hoenicke

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.
