💫 Release v3.13.0
Release Note (3.13.0)
Release time: 2022-12-15 15:33:43
This release contains 14 new features, 9 bug fixes and 7 documentation improvements.
This release introduces major features like Custom Gateways, Dynamic Batching for Executors, development support with auto-reloading, support for the new namespaced Executor scheme jinaai, improvements for our gRPC transport layer, and more.
🆕 Features
Custom Gateways (#5153, #5189, #5342, #5457, #5465, #5472 and #5477)
Jina Gateways are now customizable: you can implement them in much the same way as an Executor. With this feature, Jina lets you implement any server, protocol or interface at the Gateway level, so there is no longer any need to build an extra service on top of the Flow.
For instance, you can define a Jina Gateway that communicates with the Flow's Executors like so:
from docarray import Document, DocumentArray
from jina.serve.runtimes.gateway.http.fastapi import FastAPIBaseGateway


class MyGateway(FastAPIBaseGateway):
    @property
    def app(self):
        from fastapi import FastAPI

        app = FastAPI(title='Custom FastAPI Gateway')

        @app.get(path='/service')
        async def my_service(input: str):
            # convert input request to Documents
            docs = DocumentArray([Document(text=input)])

            # send Documents to Executors using GatewayStreamer
            result = None
            async for response_docs in self.streamer.stream_docs(
                docs=docs,
                exec_endpoint='/',
            ):
                # convert response docs to server response and return it
                result = response_docs[0].text

            return {'result': result}

        return app
Then you can use it in your Flow in the following way:
flow = Flow().config_gateway(uses=MyGateway, port=12345, protocol='http')
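For illustration, here is a minimal sketch of querying the custom /service endpoint once the Flow is running; the requests library and the example response shown are assumptions for this sketch, not part of the release itself:
with flow:
    # the custom FastAPI Gateway from above now serves HTTP on port 12345
    import requests

    resp = requests.get('http://localhost:12345/service', params={'input': 'hello'})
    print(resp.json())  # e.g. {'result': 'hello'}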
A Custom Gateway can be used as a Python class, YAML configuration or Docker image.
Adding support for Custom Gateways required exposing the Gateway API and supporting multiple ports and protocols (mentioned in a prior release). You can build a custom Gateway by subclassing the FastAPIBaseGateway class (for simple implementations) or the base Gateway class for more complex use cases.
Working on this feature also involved exposing and improving the GatewayStreamer API as a way to communicate with Executors within the Gateway.
Find more information in the Custom Gateway page.
Dynamic batching (#5410)
This release adds Dynamic batching capabilities to Executors.
Dynamic batching allows requests to be accumulated and batched together before being sent to an Executor. The batch is created dynamically depending on the configuration for each endpoint.
This feature is especially relevant for inference tasks, where batching inputs lets the model use GPU resources much more efficiently.
You can configure Dynamic batching using either a decorator or the uses_dynamic_batching parameter. The following example shows how to enable Dynamic batching on an Executor that performs model inference:
from jina import Executor, requests, dynamic_batching, Flow, DocumentArray, Document
import numpy as np
import torch


class MyExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        # initialize model
        self.model = torch.nn.Linear(in_features=128, out_features=128)

    @requests(on='/bar')
    @dynamic_batching(preferred_batch_size=10, timeout=200)
    def embed(self, docs: DocumentArray, **kwargs):
        docs.embeddings = self.model(torch.Tensor(docs.tensors))


flow = Flow().add(uses=MyExecutor)
With Dynamic Batching enabled, the Executor above will efficiently use GPU resources to perform inference by batching Documents together.
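As an alternative to the decorator, here is a minimal sketch of the same configuration via the uses_dynamic_batching parameter, assuming it takes a mapping from endpoint to batching options as described in the Dynamic Batching documentation:
from jina import Flow

flow = Flow().add(
    uses=MyExecutor,
    uses_dynamic_batching={'/bar': {'preferred_batch_size': 10, 'timeout': 200}},
)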
Read more about the feature in the Dynamic Batching documentation page.
Install requirements of local Executors (#5508)
Prior to this release, the install_requirements parameter of Executors only installed requirements for Hub Executors. Now, local Executors with a requirements.txt file will also have their requirements installed before the Flow starts.
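For example, a minimal sketch, assuming a local Executor directory my_executor/ (a hypothetical path) that contains config.yml and requirements.txt:
from jina import Flow

# requirements listed in my_executor/requirements.txt are installed before the Flow starts
flow = Flow().add(uses='my_executor/config.yml', install_requirements=True)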
Support jinaai Executor scheme to enable namespaced Hub Executors (#5462, #5468 and #5515)
As Jina AI Cloud introduced namespaces for Executor resources, we added support for the new jinaai Executor scheme. This means namespaced Executors can now be used with the jinaai scheme in the following way:
from jina import Flow
flow = Flow().add(uses='jinaai://jina-ai/DummyHubExecutor')
This scheme is also supported in Kubernetes and other APIs:
from jina import Flow
flow = Flow().add(uses='jinaai+docker://jina-ai/DummyHubExecutor')
flow.to_kubernetes_yaml('output_path', k8s_namespace='my-namespace')
Supporting the new scheme means the minimum supported version of jina-hubble-sdk has been increased to 0.26.10.
Add auto-reloading to Flow and Executor on file changes (#5461, #5488 and #5514)
A new argument reload has been added to the Flow and Executor APIs, which automatically reloads running Flows and Executors when changes are made to Executor source code or to the YAML configuration of Flows and Executors.
This feature is meant for development only; it helps developers iterate quickly by automatically applying changes to running Flows as they are made.
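For example, a minimal sketch of enabling it during development (the Executor config path is hypothetical):
from jina import Flow

# reload=True watches the Executor source code and YAML config for changes
flow = Flow().add(uses='my_executor/config.yml', reload=True)

with flow:
    flow.block()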
Find out more about this feature in the corresponding Flow and Executor documentation sections.
Expand Executor serve parameters (#5494)
The Executor.serve method can now receive more parameters, similar to those the Flow API accepts. With the new parameters controlling the Executor's serving and deployment configuration, this method makes it convenient to run an Executor as a single service.
This means you can not only build advanced microservices-based pipelines and applications, but also build individual services with all Jina features: shards/replicas, dynamic batching, auto-reload, etc.
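As a rough sketch (the exact parameter set is best checked in the Python API documentation; port and replicas here follow the Flow API and are assumptions):
from jina import Executor, requests


class MyExecutor(Executor):
    @requests
    def foo(self, docs, **kwargs):
        return docs


# serve a single Executor with deployment-style options
MyExecutor.serve(port=12345, replicas=2)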
Read more about the method in the Python API documentation.
Add gRPC trailing metadata when logging gRPC error (#5512)
When logging gRPC errors, the context's trailing metadata is now shown. This helps identify underlying network issues that would otherwise be masked behind a single gRPC status code.
For instance, the new log message looks like the following:
DEBUG gateway@ 1 GRPC call to deployment executor0 failed
with error <AioRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
...
trailing_metadata=Metadata((('content-length', '0'),
('l5d-proxy-error', 'HTTP Balancer service in
fail-fast'), ('l5d-proxy-connection', 'close'),
('date', 'Tue, 13 Dec 2022 10:20:15 GMT'))), for
retry attempt 2/3. Trying next replica, if available.
The trailing_metadata returned by load balancers will help to identify the root cause more accurately.
Implement unary_unary stub for Gateway Runtime (#5507)
This release adds the gRPC unary_unary stub to the Gateway Runtime as a new way to communicate with Executors. We added it because the gRPC performance best practices page suggests that unary RPCs can be faster than streaming RPCs in Python.
However, this is not enabled by default. The streaming RPC method is still used unless you set the stream option to False in the Client.post() method. The feature only takes effect when the gRPC protocol is used.
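For example, a minimal sketch (assuming a Flow serving gRPC on port 12345):
from jina import Client, DocumentArray

client = Client(host='grpc://localhost:12345')

# stream=False makes the Gateway use the unary_unary stub instead of streaming RPCs
docs = client.post('/', DocumentArray.empty(2), stream=False)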
Read more about the feature in the documentation: https://docs.jina.ai/concepts/client/send-receive-data/#use-unary-or-streaming-grpc
Add Startup Probe and replace Readiness Probe with Liveness Probe (#5407)
Before this release, when exporting Jina Flows to Kubernetes YAML configurations, a Kubernetes Readiness Probe was added for the Gateway Pod and each Executor Pod. In this release we add a Startup Probe and replace the Readiness Probe with a Liveness Probe.
Both probes use the jina ping command to check that Pods are healthy.
New Jina perf Docker images (#5498)
We added a slightly larger Docker image with the suffix perf, which includes a set of tools useful for performance tuning and debugging.
The new image is available on Jina AI's Docker Hub.
New Jina Docker image for Python 3.10, and use Python 3.8 for default Jina image (#5490)
Besides adding Docker images aimed at performance optimization, we added an image with a newer Python version, 3.10. This is available on Jina AI's Docker Hub, for example as jinaai/jina:master-py310-standard.
Python 3.8 is also now the default Python version and is used in the default Jina Docker images.
Minimize output of jina ping command (#5476)
The jina ping command is now less verbose and prints less irrelevant output. Important information like the latency of each round, average latency, number of successful requests, and the overall ping result will still be shown.
Add Kubernetes preStop hook to the containers (#5445)
A preStop hook has been added to the Executor and Gateway containers to allow a grace period, giving more time to complete in-flight requests and finish the server's graceful shutdown.
Generate random ports for multiple protocols (#5455)
If you use multiple protocols for a Gateway, you no longer need to specify a port for each one. Whether it's Python or YAML, you just need to specify the protocols you want to support and Jina will generate random ports for you.
Python API:
from jina import Flow

flow = Flow().config_gateway(protocol=['grpc', 'http', 'websocket'])

with flow:
    flow.block()
YAML:
jtype: Flow
gateway:
  protocol:
    - 'grpc'
    - 'http'
    - 'websocket'
🐞 Bug Fixes
List-like args passed as string (#5464)
We fixed the format expected for port, host and port_monitoring to feel more Pythonic. If you use replicas, you no longer have to provide comma-separated ports as a single string value; you can simply pass a list of values instead.
For instance, suppose we have two external replicas of an Executor that we want to join in our Flow (the first hosted on localhost:12345 and the second on 91.198.174.192:12346). We can add them like this:
from jina import Flow

replica_hosts, replica_ports = ['localhost', '91.198.174.192'], [
    '12345',
    '12346',
]  # instead of 'localhost,91.198.174.192', '12345,12346'
Flow().add(host=replica_hosts, port=replica_ports, external=True)
Or:
Flow().add(host=['localhost:12345', '91.198.174.192:12346'], external=True)
Note that this is not a breaking change, and the old syntax (comma-separated values: Flow().add(host='localhost:12345,91.198.174.192:12346', external=True)) is still supported for backwards compatibility.
Restore port to overload type hint and JSON schema (#5501)
When we made the port and protocol arguments of the Gateway support multiple values, a bug was introduced where port no longer appeared in Jina's JSON schema or in the Flow API method overloads.
Although the argument remained functional in both the Python API and YAML, this broke auto-completion and developer support for it. This release restores the port parameter in both the Flow method overloads and the JSON schema.
Do not force insecure to True in OpenTelemetry integration (#5483)
In Jina's instrumentation, communication with OpenTelemetry exporters used to be forced into insecure mode. Luckily, our community member @big-thousand picked this up and submitted a fix. The communication is no longer forced into insecure mode.
Kudos to @big-thousand for his contribution!
Fix problem when using floating Executor in HTTP (#5493)
We found a bug when using floating Executors with the HTTP protocol, where a floating Executor connected directly to the Gateway (in the Flow topology) would not receive input Documents properly. This release fixes that bug.
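As a sketch of the topology where the bug appeared (the Executor here is just the placeholder Hub Executor used earlier in these notes):
from jina import Flow

# a floating Executor receives Documents from the Gateway, but the rest of the
# Flow does not wait on its output
flow = Flow(protocol='http').add(
    name='logger', floating=True, uses='jinaai://jina-ai/DummyHubExecutor'
)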
Add egg info post install command for egg info setup mode (#5491)
This release adds support for the egg info setup mode in Python, meaning post-installation commands are now properly executed in environments that rely on this setup mode.
Previously, this bug caused issues in environments that depend on these post-installation commands, for instance setting the environment variables needed for Jina to work on macOS and enabling CLI auto-completion.
Do not apply limits when gpus='all' in Kubernetes (#5485)
If the Executor parameter gpus is set to "all", no GPU limits will be applied to the Pod in Kubernetes.
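As a sketch of the affected configuration (reusing the placeholder Hub Executor from earlier; the output path is arbitrary):
from jina import Flow

# with gpus='all', the generated Kubernetes YAML no longer sets a GPU resource limit
flow = Flow().add(uses='jinaai+docker://jina-ai/DummyHubExecutor', gpus='all')
flow.to_kubernetes_yaml('output_path', k8s_namespace='my-namespace')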
Fix Windows signal handling (#5484)
This release improves signal handling on Windows, specifically when cancelling a Flow with an OS signal.
Cap opentelemetry-instrumentation-aiohttp-client (#5452)
This release caps the version of opentelemetry-instrumentation-aiohttp-client to avoid releases that are incompatible with opentelemetry-semantic-conventions.
Raise exceptions from path importer (#5447)
Previously, errors raised from a Python module imported to load an Executor were hidden: the module was then treated as if it were not a Python module, which produced other misleading errors. In this release, actual import errors are raised and no longer hidden.
📗 Documentation Improvements
- Add gRPC requirements for Apple Silicon (M1 chip) to fix failing installation of Jina (#5511)
- Add redirects from '/fundamentals' to '/concepts' (#5504)
- Update JCloud documentation to jcloud v0.1.0 (#5385)
- Restructure documentation under /concepts
- Change Executor URI scheme to namespaced scheme jinaai (#5450)
- Custom Gateway documentation (#5465)
- Provide more accurate description for port and protocol parameters of the Gateway (#5456)
🤟 Contributors
We would like to thank all contributors to this release:
- Delgermurun (@delgermurun)
- Jie Fu (@jemmyshin)
- Alex Cureton-Griffiths (@alexcg1)
- big-thousand (@big-thousand)
- IyadhKhalfallah (@IyadhKhalfallah)
- Deepankar Mahapatro (@deepankarm)
- samsja (@samsja)
- AlaeddineAbdessalem (@alaeddine-13)
- Joan Fontanals (@JoanFM)
- Anne Yang (@AnneYang720)
- Han Xiao (@hanxiao)
- Girish Chandrashekar (@girishc13)
- Jackmin801 (@Jackmin801)