🎉 Jina 0.8.0

@nan-wang nan-wang released this 23 Nov 09:47
· 4664 commits to master since this release

We are excited to release Jina 0.8.0. Jina is an easier way to do neural search on the cloud. Highlights of this release include:

  • Introduce jinad to improve the experience of using remote Flows/Pods/Peas
  • Add support for multimodal search SparseArray
  • Add jina.types module to offer Pythonic interface to access and manipulate protobuf objects.

Release 0.8.0

⬆️ Major Features and Improvements

Ease of Use

  • We introduce two new ways of using Jina Pods remotely:
    • Create a remote Pod via SSH #1275
    • Create a remote Pod via jinad. Jinad is a daemon process working together with jina on remote machines. Jinad makes it even easier to deploy Jina Flows/Pods/Peas on remote machines. Find out more details in the README #1182, #1203, #1254, #1297, #1299, #1307, #1312, #1324
Example code:

RemoteSSHPod:
jina pod --host [email protected] --remote-access SSH

Jinad API:
jina pod --host 11.22.33.44 --port-expose 8000 --remote-access JINAD

With jinad, you can also create and use Pods directly from the Flow. First, start the Docker container equipped with jinad on the remote machine:

sudo docker run --rm -d --network host jinaai/jinad

Now you can directly create and use the remote pods from your local machine:

from jina.flow import Flow

f = (Flow()
     .add(name='p1', uses='_logforward')
     .add(name='p2', host='10.11.22.33', port_expose='8000', uses='_logforward'))
with f:
    f.search_lines(lines=['jina', 'is', 'cute'], output_fn=print)
  • We've added jina.types module, which offers a Pythonic interface to access and manipulate protobuf objects. The main types include Request, QueryLang, NdArray, Message, and Document. With the help of Jina types, you can construct inputs to Jina much more easily than before. #1283, #1284, #1289, #1323
Example code, comparing v0.7.0 with v0.8.0:

Document

v0.7.0:
from jina.proto import jina_pb2
d = jina_pb2.DocumentProto()
d.text = 'hello world'

v0.8.0:
from jina import Document
d = Document()
d.text = 'abc'

Request

v0.7.0:
from jina.proto import jina_pb2
r = jina_pb2.Request()
d = r.docs.add()

v0.8.0:
from jina.types.request import Request
from jina.types.document import Document
r = Request()
d = Document()
r.add_document(d)

Message

v0.7.0:
from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r

v0.8.0:
from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)

QueryLang

v0.7.0:
from jina.proto import jina_pb2
ql = jina_pb2.QueryLangProto(name='SliceQL')
ql.parameters['start'] = 1
ql.parameters['end'] = 3

v0.8.0:
from jina.drivers.querylang.slice import SliceQL
from jina.types.querylang import QueryLang
ql = QueryLang(SliceQL(start=1, end=3))

NdArray

v0.7.0:
import numpy as np
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17])))

v0.8.0:
import numpy as np
from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])

Completeness

⚠️ Breaking Changes

  • Refactor drivers for evaluation from function-based to type-based. #1165

    • Removed EncodeEvaluationDriver and CraftEvaluationDriver
    • Added TextEvaluateDriver, NDArrayEvaluateDriver, and FieldEvaluateDriver
    • RankingEvaluationDriver renamed to RankEvaluateDriver
  • Introduce SparseNdArray and provide generic interface for SparseNdArray and DenseNdArray #1190, #1283

Example code, comparing v0.7.0 with v0.8.0:

Dense array

v0.7.0:
import numpy as np
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17])))

v0.8.0:
import numpy as np
from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])

Sparse array

v0.7.0:
Not supported.

v0.8.0:
import numpy as np
from scipy.sparse import coo_matrix
from jina.types.ndarray.generic import NdArray
from jina.types.ndarray.sparse.scipy import SparseNdArray
row = np.array([20, 0])
col = np.array([0, 20])
data = np.array([2, 17])
a = NdArray(is_sparse=True, sparse_cls=SparseNdArray)
a.value = coo_matrix((data, (row, col)), shape=(21, 21))
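
As a rough illustration of what the (data, (row, col)) triplets in the sparse example encode, the sketch below expands them into a dense 21x21 grid using plain Python. The helper coo_to_dense is hypothetical and is not part of Jina or SciPy; it only mirrors the coordinate (COO) convention, in which each value is placed at its (row, col) position and duplicate positions are summed.

```python
def coo_to_dense(data, row, col, shape):
    """Expand COO triplets into a dense list of lists (hypothetical helper)."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for value, r, c in zip(data, row, col):
        # COO convention: values at duplicate coordinates are summed
        dense[r][c] += value
    return dense


# Same coordinates and values as in the sparse array example
row = [20, 0]
col = [0, 20]
data = [2, 17]

dense = coo_to_dense(data, row, col, shape=(21, 21))
nonzero = sum(1 for line in dense for v in line if v != 0)
print(dense[20][0], dense[0][20], nonzero)  # prints: 2 17 2
```

Only 2 of the 441 cells are non-zero here, which is the kind of data where a sparse representation saves space over a dense one.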

  • Add callback_on and continue_on_error for the client. callback_on_body is removed. #1265
Example code, comparing v0.7.0 with v0.8.0:

v0.7.0:
from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))

with f:
    f.search_lines(lines=['hello', 'jina'], callback_on_body=True)

v0.8.0:
from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))

with f:
    f.search_lines(lines=['hello', 'jina'], callback_on='body')

  • Add ProtoMessage, LazyRequest to replace the original jina_pb2.Message and jina_pb2.Request so that the protobuf message is deserialized in a lazy way #1210, #1283
Example code, comparing v0.7.0 with v0.8.0:

v0.7.0:
from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r

v0.8.0:
from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)
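
To give a feel for what lazy deserialization means in general, here is a minimal, self-contained sketch. It is not Jina's implementation: the class LazyPayload and its attributes are hypothetical, and JSON stands in for protobuf so the example runs with the standard library only. The idea shown is that the raw bytes are kept as-is and parsed only when the content is first accessed.

```python
import json


class LazyPayload:
    """Hypothetical wrapper that keeps raw bytes and parses them on first access."""

    def __init__(self, raw: bytes):
        self._raw = raw
        self._parsed = None
        self.parse_count = 0  # kept only to demonstrate that parsing happens once

    @property
    def is_parsed(self) -> bool:
        return self._parsed is not None

    @property
    def body(self) -> dict:
        # Deserialize on first access, then reuse the cached result
        if self._parsed is None:
            self._parsed = json.loads(self._raw.decode('utf-8'))
            self.parse_count += 1
        return self._parsed


payload = LazyPayload(b'{"text": "hello world"}')
print(payload.is_parsed)     # nothing has been parsed yet
print(payload.body['text'])  # first access triggers parsing
print(payload.body['text'], payload.parse_count)  # second access reuses the cache
```

A component that only forwards the message never touches body, so under this pattern it would never pay the parsing cost.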

🐞 Bug Fixes and Other Changes

Flow

  • Fix argument overridden bug for Pod when passing arguments from Flow #1189
  • Refactor num_part logic #1247
  • Enable client to interpret dict of json-like str into parsed documents #1282
  • In addition to the callback function for the Flow API, add three more actions for post-processing requests: on_done, on_error, and on_always #1303
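
The sketch below shows one plausible way such hooks can be dispatched. It assumes that on_done fires on success, on_error on failure, and on_always in both cases; that reading of the names is an assumption, not something these notes spell out. The function handle_response and the dict-based response with an 'ok' flag are hypothetical stand-ins and do not reflect Jina's internals.

```python
def handle_response(response, on_done=None, on_error=None, on_always=None):
    """Hypothetical dispatcher; `response` is a plain dict with an 'ok' flag."""
    try:
        if response.get('ok'):
            if on_done:
                on_done(response)
        elif on_error:
            on_error(response)
    finally:
        # Runs for every response, whether it succeeded or failed
        if on_always:
            on_always(response)


events = []
hooks = dict(
    on_done=lambda r: events.append(('done', r['id'])),
    on_error=lambda r: events.append(('error', r['id'])),
    on_always=lambda r: events.append(('always', r['id'])),
)

handle_response({'id': 1, 'ok': True}, **hooks)
handle_response({'id': 2, 'ok': False}, **hooks)
print(events)
# [('done', 1), ('always', 1), ('error', 2), ('always', 2)]
```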

Protos

  • Use Docker container to generate protobuf files #1241, #1242

Drivers

  • Refactor over-reduce logic to BaseDriver. Move ReduceDriver function into BaseDriver. Merge PassDriver and RouteDriver into RouteDriver #1228
  • Adapt the Drivers to jina.types #1313

Tests

  • Remove pip cache from Docker images #1168
  • Refactor unit tests for ContainerPea to pytest #1179
  • Switch back to use S3 bucket instead of GitHub for accessing fashionmnist dataset #1183
  • Refactor unit tests for CompoundExecutors to pytest #1192
  • Refactor unit tests for hello-world to pytest #1263
  • Refactor unit tests for indexing to pytest. #1258, #1237
  • Add unit tests for southpark example #1218
  • Fix flaky test #1219
  • Remove legacy code #1291, #1314
  • Adapt unit tests to jina.types #1319, #1320, #1322

Usability

  • Add --repository option for jina hub cli so users can push Pod images to their own repository. #1175
  • Replace id_tag argument with field in RankEvaluateDriver so users can access all fields of matches #1176

Documentation

Others

  • Add Black coding check for ci. #1146
  • Replace PretrainedModelFileDoesNotExist with ModelCheckpointNotExist. PretrainedModelFileDoesNotExist will be deprecated after removing all hub executors that use ModelCheckpointNotExist #1180
  • Update extra-requirements.txt. Remove unnecessary dependencies. #1208
  • Fix pull and dev-obt messages #1233
  • Improve ExceptionHandler and print aggregated error message as default callback for Python client #1238
  • Replace replica_id with pea_id for clarification #1222
  • Combine code-formatting, contributors, and copyright into one core-automate.yml to avoid merging conflicts. #1220
  • Replace %-strings with f-strings #1256
  • Remove unused code in jina/peapods/container.py #1266
  • Remove SpawnRequest #1276
  • Improve style of hello-world #1270
  • Pass ID to logger #1232
  • Update jina-hub images automatically #1274, #1292, #1294, #1296
  • Adapt to deprecation of set-env in GA. #1301
  • Remove hex from identity #1318

🙏 Thanks to our Contributors

This release contains contributions from Alex Cureton-Griffiths, Anshul Wadhawan, Bing, Deepankar Mahapatro, Han Xiao, Joan Fontanals, Maximilian Werk, Nan Wang, Nicholas Chin, Pratik Bhavsar, Rutuja Surve, Wang Bo, Yongxuanzhang, cristian, hoenickf, pswu11

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.