Releases: jina-ai/serve

💫 Release v3.21.0

26 Sep 15:55
9b5bd89

Release Note (3.21.0)

Release time: 2023-09-26 15:54:28

This release contains 1 new feature, 4 bug fixes, and 1 documentation improvement.

🆕 Features

Add return_type parameter to gateway streamer methods (#6027)

By default, GatewayStreamer will fetch executor input and output schemas and reconstruct docarray models dynamically. The output schemas will be used to cast the responses at the gateway level.

Although the cast responses are nearly identical to the original schemas defined at the executor level, they might fail some checks. For example, if the gateway receives a document doc from an executor with output MyDoc, the following check will fail:

assert isinstance(doc, MyDoc)

Similarly, adding doc to a DocList[MyDoc] will fail the type checks.

To prevent this, the user of a GatewayStreamer can now use the return_type parameter to explicitly specify the type that will be used to cast the response. Output responses received from gateway streamer methods will always match the specified return type.
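The casting behavior can be illustrated with plain Pydantic (a conceptual sketch, not Jina's internal code; MyDoc is a stand-in schema): a dynamically reconstructed model carries the same fields as the user's class but is a distinct type, so isinstance fails until the document is cast into the requested return type.

```python
from pydantic import BaseModel, create_model


class MyDoc(BaseModel):
    text: str


# The gateway reconstructs the executor's output schema dynamically.
# The resulting class has identical fields but is NOT the user's class:
DynamicMyDoc = create_model('MyDoc', text=(str, ...))
doc = DynamicMyDoc(text='hello')
assert not isinstance(doc, MyDoc)

# Casting the document into the explicitly requested return type
# (what the new return_type parameter enables) restores the check:
cast_doc = MyDoc(text=doc.text)
assert isinstance(cast_doc, MyDoc)
```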

🐞 Bug Fixes

Fix topology schema validation (#6057)

Previously, if the endpoint model schemas of different Executors in a Flow did not match exactly, the Flow would fail to start. Even a difference as small as a different default value would give rise to this error.

We have fixed this bug by relaxing the model schema checking to only verify that the types of properties match.
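As a rough illustration of the relaxed check (a sketch, not Jina's actual validation code), compatibility now hinges only on each property's declared type, so two schemas that differ only in defaults or other metadata pass:

```python
def schemas_type_compatible(schema_a: dict, schema_b: dict) -> bool:
    # Compare only property names and their declared types;
    # defaults, descriptions and other metadata are ignored.
    props_a = schema_a.get('properties', {})
    props_b = schema_b.get('properties', {})
    if set(props_a) != set(props_b):
        return False
    return all(
        props_a[name].get('type') == props_b[name].get('type') for name in props_a
    )


# Schemas that differ only in a default value are considered compatible:
a = {'properties': {'text': {'type': 'string', 'default': 'foo'}}}
b = {'properties': {'text': {'type': 'string', 'default': 'bar'}}}
assert schemas_type_compatible(a, b)
```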

Fix consensus module memory leak (#6054)

In the consensus module, written in Go, some allocated strings were not being properly released. This memory leak has now been fixed.

Document casting in Flow gateway (#6032)

This fix builds on #6027. We now use the return_type parameter in GatewayStreamer to ensure that Documents received at the gateway level are properly cast to the correct schema. This prevents validation and serialization errors that previously occurred.

Remove sandbox (#6047)

Remove support for deploying Executors in the Jina Cloud sandbox, since the sandbox has been deprecated.

📗 Documentation Improvements

  • Make example copy-pastable (#6052)

🤟 Contributors

We would like to thank all contributors to this release:

💫 Patch v3.20.3

07 Sep 08:48

Release Note (3.20.3)

Release time: 2023-09-07 08:46:43

This release contains 1 bug fix.

🐞 Bug Fixes

Skip doc attributes in annotations but not in fields (#6035)

When deploying an Executor inside a Flow with a BaseDoc model that has an attribute annotated as a ClassVar, the service would fail to initialize because the Gateway could not properly create the schemas. We have fixed this by guarding access to __fields__ when dynamically creating these Pydantic models.
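The underlying Pydantic behavior can be seen directly (a plain-Pydantic illustration, not Jina's code): a ClassVar annotation shows up in __annotations__ but is not a model field, so schema-building code must not assume every annotated attribute appears in __fields__.

```python
from typing import ClassVar

from pydantic import BaseModel


class MyDoc(BaseModel):
    text: str
    schema_version: ClassVar[str] = 'v1'  # class-level metadata, not a field


# The ClassVar is visible in the annotations...
assert 'schema_version' in MyDoc.__annotations__
# ...but Pydantic correctly excludes it from the model's fields:
assert 'schema_version' not in MyDoc.__fields__
assert 'text' in MyDoc.__fields__
```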

🤟 Contributors

We would like to thank all contributors to this release:

💫 Patch v3.20.2

06 Sep 07:38
0569354

Release Note (3.20.2)

Release time: 2023-09-06 07:37:25

This release contains 1 bug fix and 1 refactoring.

🐞 Bug Fixes

Fix install issue (#6037)

Fix an installation issue that appeared because a new release of opentelemetry-sdk prevented pip from finding compatible libraries.

⚙ Refactoring

Refactor the setup.py file (#6038)

Some hard-coded strings were replaced with constants to improve code readability.

🤟 Contributors

We would like to thank all contributors to this release:

💫 Patch v3.20.1

10 Aug 15:51
0a33cde

Release Note (3.20.1)

Release time: 2023-08-10 15:49:12

This release contains 2 bug fixes and 2 documentation improvements.

🐞 Bug Fixes

Make Gateway load balancer stream results (#6024)

Streaming endpoints in Executors can be deployed behind a Gateway (when using include_gateway=True in Deployment).

In this case, the Gateway acts as a load balancer. However, prior to this release, when the HTTP protocol was used, the Gateway would wait until all chunks of the response had been streamed from the Executor.

Only when all the chunks were received would it send them back to the client. This resulted in delays and suppressed the desired behavior of a streaming endpoint (namely, displaying tokens streamed from an LLM with a typewriter effect).
This release fixes this issue by making the Gateway stream chunks of responses from the forwarded request as soon as they are received from the Executor.

Whether you set include_gateway to True or False in Deployment, streaming endpoints now show the same desired behavior.
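The change is easiest to see with a plain asyncio analogue (a conceptual sketch, not the Gateway's implementation): the old behavior collected every chunk before responding, while the new behavior forwards each chunk the moment it arrives.

```python
import asyncio


async def executor_response():
    # Stands in for chunks streamed by the Executor (e.g. LLM tokens)
    for i in range(3):
        await asyncio.sleep(0)
        yield f'token-{i}'


async def buffered_gateway():
    # Old behavior: nothing reaches the client until all chunks arrived
    return [chunk async for chunk in executor_response()]


async def streaming_gateway():
    # New behavior: each chunk is forwarded as soon as it is received
    async for chunk in executor_response():
        yield chunk


async def main():
    return [chunk async for chunk in streaming_gateway()]


print(asyncio.run(main()))  # ['token-0', 'token-1', 'token-2']
```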

Fix deeply nested schema support in Executors and Flows (#6021)

When a deeply nested schema (DocArray schema with 2+ levels of nesting) was specified as an input or output of an Executor endpoint, and the Executor was deployed in a Flow, the Gateway would fail to fetch information about the endpoints and their input/output schemas:

from typing import Optional
from docarray import BaseDoc, DocList
from jina import Executor, Flow, requests


class Nested2Doc(BaseDoc):
    value: str


class Nested1Doc(BaseDoc):
    nested: Nested2Doc


class RootDoc(BaseDoc):
    text: str
    nested: Optional[Nested1Doc]


class NestedSchemaExecutor(Executor):
    @requests(on='/endpoint')
    async def endpoint(self, docs: DocList[RootDoc], **kwargs) -> DocList[RootDoc]:
        rets = DocList[RootDoc]()
        rets.append(
            RootDoc(
                text='hello world', nested=Nested1Doc(nested=Nested2Doc(value='test'))
            )
        )
        return rets


flow = Flow().add(uses=NestedSchemaExecutor)
with flow:
    res = flow.post(
        on='/endpoint', inputs=RootDoc(text='hello'), return_type=DocList[RootDoc]
    )
...
2023-08-07 02:49:32,529 topology_graph.py[608] WARNING Getting endpoints failed: 'definitions'. Waiting for another trial

This was due to an internal utility function failing to convert such deeply nested JSON schemas to DocArray models.
This release fixes the issue by propagating global schema definitions when generating models for nested schemas.

📗 Documentation Improvements

  • Remove extra backtick in create-app.md (#6023)
  • Fix streaming endpoint reference in README (#6017)

🤟 Contributors

We would like to thank all contributors to this release:

💫 Release v3.20.0

04 Aug 08:59

Release Note (3.20.0)

Release time: 2023-08-04 08:58:38

This release contains 5 new features, 3 bug fixes, and 8 documentation improvements.

🆕 Features

Executor can work on single documents (#5991)

Executors no longer need to work solely on a DocList, but can expose endpoints for working on single documents.

For this, the method decorated by requests must take a doc argument, with type annotations specifying the input and output document types.

from jina import Executor, requests
from docarray import BaseDoc

class MyInputDocument(BaseDoc):
    num: int

class MyOutputDocument(BaseDoc):
    label: str

class MyExecutor(Executor):
    @requests(on='/hello')
    async def task(self, doc: MyInputDocument, **kwargs) -> MyOutputDocument:
        return MyOutputDocument(label='even' if doc.num % 2 == 0 else 'odd')

This keeps Executor code clean, especially for serving models that can't benefit from working on batches of documents at the same time.

Parameters can be described as Pydantic models (#6001)

An Executor's parameters argument can now be a Pydantic model rather than a plain Python dictionary. To use a Pydantic model, the parameters argument needs to have the model as a type annotation.

Defining parameters as a Pydantic model instead of a simple Dict has two main benefits:

  • Validation and default values: parameters are validated against the model before the Executor runs, so invalid keys or values are rejected early. You can also easily define defaults.
  • Descriptive OpenAPI definition when using the HTTP protocol.
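For illustration, here is what such a parameters model buys you in plain Pydantic (SearchParams is a hypothetical model; in Jina you would use it as the type annotation of the endpoint's parameters argument):

```python
from pydantic import BaseModel, ValidationError


class SearchParams(BaseModel):
    top_k: int = 5        # default applied when the client omits the key
    min_score: float = 0.0


# Well-formed input is coerced and validated, defaults are filled in:
params = SearchParams(**{'top_k': '10'})
assert params.top_k == 10 and params.min_score == 0.0

# Invalid input is rejected before the Executor logic ever runs:
try:
    SearchParams(**{'top_k': 'not-a-number'})
except ValidationError:
    pass
```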

Expose richer OpenAPI when serving Executor with HTTP inside a Flow (#5992)

Executors served with Deployments and Flows can now provide a descriptive OpenAPI when using HTTP. The description, examples and other relevant fields are used in the Gateway to provide a complete API.

Support streaming in single-Executor Flows (#5988)

Streaming endpoints are now also supported by the Flow orchestration layer; it is no longer mandatory to use a Deployment.
A Flow orchestration can accept streaming endpoints under both the gRPC and HTTP protocols.

with Flow(protocol=protocol, port=port, cors=True).add(
    uses=StreamingExecutor,
):
    client = Client(port=port, protocol=protocol, asyncio=True)
    i = 10
    async for doc in client.stream_doc(
            on='/hello',
            inputs=MyDocument(text='hello world', number=i),
            return_type=MyDocument,
    ):
        print(doc)

Streaming endpoints with gRPC protocol (#5921)

After adding SSE support to allow streaming documents one by one for the HTTP protocol, we added the same functionality for the gRPC protocol. A Jina server can now stream single Documents to a client, one at a time, using gRPC. This feature relies on streaming gRPC endpoints under the hood.

One typical use-case of this feature is streaming tokens from a Large Language Model. For instance, check our how-to on streaming LLM tokens.

from jina import Executor, requests, Deployment
from docarray import BaseDoc

# first define schemas
class MyDocument(BaseDoc):
    text: str

# then define the Executor
class MyExecutor(Executor):

    @requests(on='/hello')
    async def task(self, doc: MyDocument, **kwargs) -> MyDocument:
        for i in range(100):
            yield MyDocument(text=f'hello world {i}')
            
with Deployment(
    uses=MyExecutor,
    port=12345,
    protocol='grpc', # or 'http'
) as dep:
    dep.block()

From the client side, you can use the new stream_doc() method to receive documents one by one:

from jina import Client

client = Client(port=12345, protocol='grpc', asyncio=True)
async for doc in client.stream_doc(
    on='/hello', inputs=MyDocument(text='hello world'), return_type=MyDocument
):
    print(doc.text)

Read more in the docs.

🐞 Bug Fixes

Fix caching models from all endpoints, inputs and outputs (#6005)

An issue was fixed that caused problems when using an Executor inside a Flow where the same document type was used as input and output in different endpoints.

Use 127.0.0.1 as local ctrl address (#6004)

The orchestration layer will use 127.0.0.1 to send health checks to Executors and Gateways when working locally. It previously used 0.0.0.0 as the default host and this caused issues in some configurations.

Ignore warnings from Google (#5968)

Warnings that used to appear in relation to the pkg_resources deprecated API are now suppressed.
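For reference, this kind of suppression uses Python's warnings filter; the sketch below shows an illustrative filter (not necessarily the exact one Jina installs):

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # Ignore only the pkg_resources deprecation message, keep everything else:
    warnings.filterwarnings(
        'ignore', message='pkg_resources is deprecated.*', category=DeprecationWarning
    )
    warnings.warn('pkg_resources is deprecated as an API', DeprecationWarning)
    assert len(caught) == 0  # the warning was filtered out
```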

📗 Documentation Improvements

  • Fix errors in getting started and preliminaries (#6008)
  • Add note about Flow with one Executor supported (#5990)
  • Add docs for secrets and jobs (#5948)
  • Remove include gateway arg (#5987)
  • Add Kubernetes hint to port ignore (#5985)
  • Add docs for streaming (#5980)
  • Add docs for jcloud horizontal pod autoscale (#5957)
  • Update docs to show single document serving (#6009)

🤟 Contributors

We would like to thank all contributors to this release:

💫 Patch v3.19.1

19 Jul 07:55
eda0fbb

Release Note (3.19.1)

Release time: 2023-07-19 07:54:28

This release contains 5 bug fixes.

🐞 Bug Fixes

Dynamic batching with docarray>=0.30 (#5970)

Dynamic batching requests for Executors were not working when using docarray>=0.30.0. This fix makes this feature fully compatible with newer versions of DocArray.

Monitoring validation error with docarray>=0.30 (#5965)

When using docarray>=0.30 in combination with monitoring, there was a risk of getting a validation error because the input and output schemas were not properly considered.

Fail fast when no valid schemas (#5962)

When no valid schemas were used, Executors sometimes failed to load, but the Gateway would continue to try getting the endpoints from them until it timed out. Now, everything will stop faster without the long wait.

Properly handle multiprotocol Deployments to Kubernetes (#5961)

When converting a Deployment using multiple protocols to Kubernetes YAML, the resulting services did not use or expose the ports and protocols as expected. This has now been fixed, so when doing:

from jina import Deployment

d = Deployment(protocol=['grpc', 'http'])
d.to_kubernetes_yaml('./k8s-deployment')

you will get YAML where each of the two services exposes its protocol.

Fix gRPC connectivity issues for health check (#5972)

Fixed an issue where a Flow could not health-check its Executors because an HTTP proxy was configured in the environment.

🤟 Contributors

We would like to thank all contributors to this release:

💫 Release v3.19.0

10 Jul 09:03
f9bf278

Release Note (3.19.0)

Release time: 2023-07-10 09:01:16

This release contains 3 new features, 4 bug fixes, and 3 documentation improvements.

🆕 Features

Jina is now compatible with all versions of DocArray: unpin version in requirements (#5941)

Jina is now fully compatible with docarray>=0.30, which uncaps the version requirement.

By default, Jina will install the latest DocArray version; however, it remains compatible with older versions. If you still want to use the old version and syntax, manually install docarray<0.30 or pin the requirement in your project.

from docarray import BaseDoc, DocList
from jina import Deployment, Executor, requests


class MyDoc(BaseDoc):
    text: str


class MyExec(Executor):
    @requests(on='/foo')
    def foo(self, docs: DocList[MyDoc], **kwargs):
        docs[0].text = 'hello world'


with Deployment(uses=MyExec) as d:
    docs = d.post(on='/foo', inputs=MyDoc(text='hello'), return_type=DocList[MyDoc])
    assert docs[0].text == 'hello world'

Use dynamic gateway Hubble image (#5935)

In order to make Flows compatible with both docarray>=0.30 and docarray<0.30, Hubble provides utilities that adapt the jina and docarray versions to the user's system. This also requires rebuilding the gateway image used in Kubernetes, so we have created a Hubble image that dynamically adapts to the system's docarray version.

Add image_pull_secrets argument to Flow to enable pulling from a private registry in Kubernetes (#5952)

In order for Kubernetes to pull Docker images from a private registry, users need to create secrets that are passed to the Deployments as imagePullSecrets.

Jina now provides an image_pull_secrets argument for Deployments and Flows, which makes sure those secrets are used by Kubernetes after applying to_kubernetes_yaml:

from jina import Flow

f = Flow(image_pull_secrets=['regcred']).add()
f.to_kubernetes_yaml(...)

🐞 Bug Fixes

Fix validation with default endpoint (#5956)

When using docarray>=0.30, the Gateway would not start when an Executor binding to the /default endpoint was connected to another Executor that did not bind to this special endpoint: the topology was considered incompatible.

We have solved this problem and this is now possible:

from jina import Flow, Executor, requests


class Encoder(Executor):
    @requests
    def encode(self, **kwargs):
        pass


class Indexer(Executor):
    @requests(on='/index')
    def index(self, **kwargs):
        pass

    @requests(on='/search')
    def search(self, **kwargs):
        pass


f = Flow().add(uses=Encoder).add(uses=Indexer)

with f:
    f.block()

Apply return_type when return_responses=True (#5949)

When calling client.post with arguments return_type and return_responses=True, the return_type parameter was not properly applied. This is now fixed and when accessing the docs of the Response they will have the expected type.

from jina import Executor, Deployment, requests
from docarray import DocList, BaseDoc


class InputDoc(BaseDoc):
    text: str


class OutputDoc(BaseDoc):
    len: int


class LenExecutor(Executor):
    @requests
    def foo(self, docs: DocList[InputDoc], **kwargs) -> DocList[OutputDoc]:
        ret = DocList[OutputDoc]()
        for doc in docs:
            ret.append(OutputDoc(len=len(doc.text)))
        return ret


d = Deployment(uses=LenExecutor)

with d:
    resp = d.post(
        "/",
        inputs=InputDoc(text="five"),
        return_type=DocList[OutputDoc],
        return_responses=True,
    )
    assert isinstance(resp[0].docs[0], OutputDoc)

Fix generator detection (#5947)

Jina wrongly tagged async methods as generators, which are reserved for single-Document streaming. This is now fixed, and async methods can safely be used in Executors with docarray>=0.30.

Fix Flow.plot method (#5934)

The plot method for Flow was producing the broken URL https://mermaid.ink/. This has now been fixed.

📗 Documentation Improvements

  • Adapt documentation to focus on new DocArray (#5941)
  • Text not tags in code snippets (#5930)
  • Changes for the links and hugging face model name (#5955)

🤟 Contributors

We would like to thank all contributors to this release:

💫 Release v3.18.0

22 Jun 13:34
ba7a0b3

Release Note (3.18.0)

Release time: 2023-06-22 13:32:50

This release contains 2 new features, 4 bug fixes, and 2 documentation improvements.

🆕 Features

Streaming single Document with HTTP SSE for Deployment (#5899)

In this release, we have added support for Server-Sent Events (SSE) to Jina's HTTP protocol with the Deployment orchestration. A Jina server can now stream single Documents to a client, one at a time. This is useful for applications that require a continuous stream of data, such as chatbots (using Large Language Models) or real-time translation.

Simply define an endpoint function that receives a single Document and yields Documents one by one:

from jina import Executor, requests, Document, Deployment

class MyExecutor(Executor):
    @requests(on='/hello')
    async def task(self, doc: Document, **kwargs):
        for i in range(3):
            yield Document(text=f'{doc.text} {i}')
            
with Deployment(
    uses=MyExecutor,
    port=12345,
    protocol='http',
    cors=True,
    include_gateway=False,
) as dep:
    dep.block()

From the client side, you can use the new stream_doc method to receive the Documents one by one:

from jina import Client, Document

client = Client(port=12345, protocol='http', asyncio=True)
async for doc in client.stream_doc(
    on='/hello', inputs=Document(text='hello world')
):
    print(doc.text)

Note that the SSE client is language-independent. This feature also supports DocArray v2. For more information, see the streaming endpoints section of the Jina documentation.

Add env_from_secret option to Gateway (#5914)

We have added the env_from_secret parameter to the Gateway, so that custom gateways can load secrets from Kubernetes when exported to Kubernetes YAML, just like Executors.

from jina import Flow

f = (
    Flow()
    .config_gateway(
        env_from_secret={'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'}}
    )
    .add()
)
f.to_kubernetes_yaml('./k8s-flow')

🐞 Bug Fixes

Fix error working with some data types in DocArray V2 (#5905) (#5908)

We have fixed some errors when DocArray v2 documents contained flexible dictionaries, Lists, or Tensors.

Fix reloading Executor when it is loaded from config.yml (#5915)

The reload option of an Executor was not working when the Executor was loaded from config.yml, so Jina could not pick up Executor code changes.

f = Flow().add(uses='config.yml', reload=True)
with f:
    f.block()

Ensure closing Executor at shutdown in Deployment with HTTP protocol (#5906)

We have fixed a bug that prevented the close method of the Executor from being executed at shutdown when the Deployment is exposed with an HTTP server.

Fix issue in mismatch endpoint when using shards (#5904)

A KeyError was raised when working with DocArray v2 in a Deployment using shards if the request endpoint did not match any Executor endpoint. With this fix, the default endpoint is called correctly.

from typing import List

import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from jina import Deployment, Executor, requests


class MyDoc(BaseDoc):
    text: str
    embedding: NdArray[128]


class MyDocWithMatchesAndScores(MyDoc):
    matches: DocList[MyDoc]
    scores: List[float]


class MyExec(Executor):
    @requests
    def foo(
        self, docs: DocList[MyDoc], **kwargs
    ) -> DocList[MyDocWithMatchesAndScores]:
        res = DocList[MyDocWithMatchesAndScores]()
        for doc in docs:
            new_doc = MyDocWithMatchesAndScores(
                text=doc.text,
                embedding=doc.embedding,
                matches=docs,
                scores=[1.0 for _ in docs],
            )
            res.append(new_doc)
        return res


d = Deployment(uses=MyExec, shards=2)
with d:
    res = d.post(
        on='/',
        inputs=DocList[MyDoc]([MyDoc(text='hey ha', embedding=np.random.rand(128))]),
    )
    assert len(res) == 1

📗 Documentation Improvements

  • README revamp to put more emphasis on generative AI use cases (#5895)
  • Fix links in docs (#5909)

🤟 Contributors

We would like to thank all contributors to this release:

💫 Release v3.17.0

06 Jun 15:26
bef7159

Release Note (3.17.0)

Release time: 2023-06-06 15:25:15

This release contains 1 new feature and 1 bug fix.

🆕 Features

Flows now compatible with DocArray v2 (#5861)

Finally, Flows and Deployments are fully compatible with the new DocArray version (above 0.30.0). This includes all supported protocols and features, namely http, grpc and websocket for Flow and http and grpc for Deployment.

Now you can enjoy the capacity and expressivity of the new DocArray together with the performance, scalability and richness of Jina.

from docarray import BaseDoc, DocList
from jina import Client, Executor, Flow, requests


class MyDoc(BaseDoc):
    text: str


class MyExec(Executor):
    @requests(on='/foo')
    def foo(self, docs: DocList[MyDoc], **kwargs):
        docs[0].text = 'hello world'


ports = [12345, 12346]
protocols = ['http', 'grpc']
with Flow(port=ports, protocol=protocols).add(uses=MyExec):
    for port, protocol in zip(ports, protocols):
        c = Client(port=port, protocol=protocol)
        docs = c.post(on='/foo', inputs=MyDoc(text='hello'), return_type=DocList[MyDoc])
        assert docs[0].text == 'hello world'

While Jina is fully compatible with the new DocArray version, for now it will not install the latest version, since there remain compatibility issues with Hubble and JCloud. As soon as these are resolved, Jina will install the new version by default.

🐞 Bug Fixes

Fix instantiation of Executor with write decorator (#5897)

Fix instantiation of Executors where the first method is decorated with a @write decorator.

from jina import Executor, requests
from jina.serve.executors.decorators import write

class WriteExecutor(Executor):
    @write
    @requests(on='/delete')
    def delete(self, **kwargs):
        pass

    @requests(on='/bar')
    @write
    def bar(self, **kwargs):
        pass

🤟 Contributors

We would like to thank all contributors to this release:

💫 Patch v3.16.1

23 May 15:01
52a29ee

Release Note (3.16.1)

Release time: 2023-05-23 14:59:35

This patch contains 1 refactoring, 3 bug fixes and 3 documentation improvements.

⚙ Refactoring

Remove aiostream dependency (#5891)

Remove the aiostream dependency, which could have been a source of licensing problems.

🐞 Bug Fixes

Fix usage of ports and protocols alias (#5885)

You can now use plural ports and protocols:

from jina import Flow

f = Flow(ports=[12345, 12346], protocols=['grpc', 'http']).add()

with f:
    f.block()

Previously, these plural arguments were not correctly applied to the inner Deployments.

Fix endpoint printing (#5884)

Fix endpoint printing when no ports are specified with multi-protocol Deployments:

from jina import Deployment

d = Deployment(protocols=['grpc', 'http'])

with d:
    pass

Before: (screenshot)

After: (screenshot)

Fix docs and redocs links (#5883)

Fix the printed links to the docs and redocs pages when the HTTP protocol is used in combination with other protocols.

from jina import Deployment

d = Deployment(protocol=['grpc', 'http'], port=[12345, 12346])
with d:
    pass

Old behavior: (screenshot)

New behavior: (screenshot)

📗 Documentation Improvements

  • Fix import in notebook causing a crash (#5888)
  • Add docs for Jina Cloud logs (#5892)
  • Fix YAML specs links (#5887)

🤟 Contributors

We would like to thank all contributors to this release: