Releases: jina-ai/serve
💫 Release v3.21.0
Release Note (3.21.0)
Release time: 2023-09-26 15:54:28
This release contains 1 new feature, 4 bug fixes, and 1 documentation improvement.
🆕 Features
Add return_type parameter to gateway streamer methods (#6027)
By default, GatewayStreamer will fetch executor input and output schemas and reconstruct docarray models dynamically. The output schemas will be used to cast the responses at the gateway level.
Although the cast responses are nearly identical to the original schemas defined at the executor level, they might fail some checks. For example, if the gateway receives a document doc from an executor with output MyDoc, the following check will fail:
assert isinstance(doc, MyDoc)
Similarly, adding doc to a DocList[MyDoc] will fail the type checks.
To prevent this, the user of a GatewayStreamer can now use the return_type parameter to explicitly specify the type that will be used to cast the response. Output responses received from gateway streamer methods will always match the specified return type.
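The type mismatch described above can be reproduced without Jina. The following is a minimal sketch using only the standard library: RebuiltMyDoc is a hypothetical stand-in for a model reconstructed from a schema, and cast_to is an illustrative helper, not part of the GatewayStreamer API.

```python
from dataclasses import asdict, dataclass, make_dataclass


@dataclass
class MyDoc:
    text: str


# Hypothetical stand-in for a model rebuilt dynamically from a schema:
# same name and same fields, but a different class object.
RebuiltMyDoc = make_dataclass('MyDoc', [('text', str)])

doc = RebuiltMyDoc(text='hello')
same_name = type(doc).__name__ == MyDoc.__name__
is_instance = isinstance(doc, MyDoc)  # fails despite the identical shape


def cast_to(return_type, obj):
    """Illustrative cast: rebuild the object as the explicitly requested type."""
    return return_type(**asdict(obj))


cast_doc = cast_to(MyDoc, doc)
```

Passing an explicit return type plays the role of cast_to here: the caller names the class it expects instead of relying on a structurally similar one.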
🐞 Bug Fixes
Fix topology schema validation (#6057)
Previously, if the endpoint model schemas of different Executors in a Flow did not match exactly, the Flow would fail to start. Even a difference as small as a different default value would trigger this error.
We have fixed this bug by relaxing the model schema checking to only verify that the types of properties match.
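As a rough illustration of the relaxed check, the sketch below compares two JSON-schema-like dictionaries by property types only. The function names and the sample schemas are hypothetical; this is not Jina's internal validation code.

```python
def property_types(schema):
    """Map each property name to its declared type, ignoring defaults and titles."""
    return {
        name: prop.get('type')
        for name, prop in schema.get('properties', {}).items()
    }


def schemas_compatible(first, second):
    """Relaxed comparison: only the property names and their types must agree."""
    return property_types(first) == property_types(second)


encoder_output = {
    'properties': {
        'text': {'type': 'string', 'default': ''},
        'score': {'type': 'number'},
    }
}
indexer_input = {
    'properties': {
        'text': {'type': 'string', 'default': 'n/a'},  # only the default differs
        'score': {'type': 'number'},
    }
}
mismatched_input = {
    'properties': {
        'text': {'type': 'integer'},  # the type itself differs
        'score': {'type': 'number'},
    }
}

strict_equal = encoder_output == indexer_input
relaxed_ok = schemas_compatible(encoder_output, indexer_input)
relaxed_mismatch = schemas_compatible(encoder_output, mismatched_input)
```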
Fix consensus module memory leak (#6054)
In the consensus golang module, some allocated strings were not being properly released. We have repaired this.
Document casting in Flow gateway (#6032)
This bug is related to #6027. We now use the return_type parameter in GatewayStreamer to ensure that Documents received at the gateway level are properly cast to the correct schema. This prevents validation and serialization errors that previously occurred.
Remove sandbox (#6047)
Remove support for deploying Executors in the Jina Cloud sandbox, since the sandbox has been deprecated.
📗 Documentation Improvements
- Make example copy-pastable (#6052)
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
- Naymul Islam (@ai-naymul)
- Deepankar Mahapatro (@deepankarm)
- AlaeddineAbdessalem (@alaeddine-13)
- Han Xiao (@hanxiao)
- XXIV (@thechampagne)
💫 Patch v3.20.3
Release Note (3.20.3)
Release time: 2023-09-07 08:46:43
This release contains 1 bug fix.
🐞 Bug Fixes
Skip doc attributes in annotations but not in fields (#6035)
When deploying an Executor inside a Flow with a BaseDoc model that has any attribute with a ClassVar value, the service would fail to initialize because the Gateway could not properly create the schemas. We have fixed this by securing access to __fields__ when dynamically creating these pydantic models.
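The underlying mismatch is that a ClassVar attribute shows up in a class's annotations but is not an instance field. The sketch below shows the distinction with the standard library only; the Model class and the helper name are hypothetical and do not reproduce the actual fix.

```python
from typing import ClassVar, get_origin


class Model:
    text: str
    registry: ClassVar[dict] = {}  # class-level state, not a document field


def instance_field_names(cls):
    """Return annotated names that are real fields, skipping ClassVar entries."""
    return [
        name
        for name, annotation in cls.__annotations__.items()
        if get_origin(annotation) is not ClassVar
    ]


annotated = list(Model.__annotations__)
fields = instance_field_names(Model)
```

Code that iterates over annotations and then looks each name up in the field mapping needs a guard of this kind, otherwise the ClassVar name is missing from the fields.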
🤟 Contributors
We would like to thank all contributors to this release:
- Narek Amirbekian (@NarekA)
💫 Patch v3.20.2
Release Note (3.20.2)
Release time: 2023-09-06 07:37:25
This release contains 1 bug fix and 1 refactoring.
🐞 Bug Fixes
Fix install issue (#6037)
Fix an installation issue that appeared because a new release of opentelemetry-sdk prevented pip from finding compatible libraries.
⚙ Refactoring
Refactor the setup.py file (#6038)
Some hard-coded strings were replaced with constants to improve code readability.
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
- Naymul Islam (@ai-naymul)
💫 Patch v3.20.1
Release Note (3.20.1)
Release time: 2023-08-10 15:49:12
This release contains 2 bug fixes and 2 documentation improvements.
🐞 Bug Fixes
Make Gateway load balancer stream results (#6024)
Streaming endpoints in Executors can be deployed behind a Gateway (when using include_gateway=True in Deployment).
In this case, the Gateway acts as a load balancer. However, prior to this release, when the HTTP protocol was used, the Gateway would wait until all chunks of the responses had been streamed from the Executor.
Only when all the chunks were received would it send them back to the client. This resulted in delays and suppressed the desired behavior of a streaming endpoint (namely, displaying tokens streamed from an LLM with a typewriter effect).
This release fixes this issue by making the Gateway stream chunks of responses from the forwarded request as soon as they are received from the Executor.
Whether you set include_gateway to True or False in Deployment, streaming endpoints should now behave the same way.
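The difference between the old and new Gateway behavior can be sketched with plain generators. Everything below is a simplified stand-in (no HTTP, no Jina): the function names are hypothetical, and the event log only shows the order in which chunks are produced and received.

```python
def executor_stream(tokens, log):
    """Stand-in for an Executor streaming endpoint."""
    for token in tokens:
        log.append(f'produced {token}')
        yield token


def buffering_gateway(chunks):
    """Old behavior: collect every chunk before forwarding anything."""
    buffered = list(chunks)
    yield from buffered


def streaming_gateway(chunks):
    """New behavior: forward each chunk as soon as it arrives."""
    for chunk in chunks:
        yield chunk


def consume(gateway, tokens):
    """Stand-in for a client; returns the interleaved event log."""
    log = []
    for chunk in gateway(executor_stream(tokens, log)):
        log.append(f'received {chunk}')
    return log


buffered_log = consume(buffering_gateway, ['a', 'b'])
streamed_log = consume(streaming_gateway, ['a', 'b'])
```

In the buffered log every 'produced' entry precedes the first 'received' entry, which is what removed the typewriter effect; in the streamed log the two alternate.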
Fix deeply nested Schemas support in Executors and Flows (#6021)
When a deeply nested schema (DocArray schema with 2+ levels of nesting) was specified as an input or output of an Executor endpoint, and the Executor was deployed in a Flow, the Gateway would fail to fetch information about the endpoints and their input/output schemas:
from typing import Optional

from docarray import BaseDoc, DocList
from jina import Executor, Flow, requests

class Nested2Doc(BaseDoc):
    value: str

class Nested1Doc(BaseDoc):
    nested: Nested2Doc

class RootDoc(BaseDoc):
    nested: Optional[Nested1Doc]
    text: str  # declared so that RootDoc(text=...) below is valid

class NestedSchemaExecutor(Executor):
    @requests(on='/endpoint')
    async def endpoint(self, docs: DocList[RootDoc], **kwargs) -> DocList[RootDoc]:
        rets = DocList[RootDoc]()
        rets.append(
            RootDoc(
                text='hello world', nested=Nested1Doc(nested=Nested2Doc(value='test'))
            )
        )
        return rets

flow = Flow().add(uses=NestedSchemaExecutor)
with flow:
    res = flow.post(
        on='/endpoint', inputs=RootDoc(text='hello'), return_type=DocList[RootDoc]
    )
    ...
2023-08-07 02:49:32,529 topology_graph.py[608] WARNING Getting endpoints failed: 'definitions'. Waiting for another trial
This was due to an internal utility function failing to convert such deeply nested JSON schemas to DocArray models.
This release fixes the issue by propagating global schema definitions when generating models for nested schemas.
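To illustrate why the definitions need to be propagated, here is a minimal sketch that resolves nested $ref entries against one shared definitions table. The resolve_refs helper and the sample schema are hypothetical; Jina's internal utility builds DocArray models rather than plain dictionaries.

```python
def resolve_refs(schema, definitions):
    """Inline every '$ref' using the top-level definitions, at any depth."""
    if isinstance(schema, dict):
        ref = schema.get('$ref')
        if ref is not None:
            name = ref.split('/')[-1]
            # Nested schemas carry no 'definitions' of their own, so the
            # global table is passed down on every recursive call.
            return resolve_refs(definitions[name], definitions)
        return {
            key: resolve_refs(value, definitions)
            for key, value in schema.items()
            if key != 'definitions'
        }
    if isinstance(schema, list):
        return [resolve_refs(item, definitions) for item in schema]
    return schema


root_schema = {
    'title': 'RootDoc',
    'type': 'object',
    'properties': {'nested': {'$ref': '#/definitions/Nested1Doc'}},
    'definitions': {
        'Nested1Doc': {
            'title': 'Nested1Doc',
            'type': 'object',
            'properties': {'nested': {'$ref': '#/definitions/Nested2Doc'}},
        },
        'Nested2Doc': {
            'title': 'Nested2Doc',
            'type': 'object',
            'properties': {'value': {'type': 'string'}},
        },
    },
}

resolved = resolve_refs(root_schema, root_schema['definitions'])
inner = resolved['properties']['nested']['properties']['nested']
```

Looking up schema['definitions'] on the nested Nested1Doc schema instead of the shared table would raise KeyError: 'definitions', which matches the warning in the log line above.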
📗 Documentation Improvements
🤟 Contributors
We would like to thank all contributors to this release:
- Saba Sturua (@jupyterjazz)
- AlaeddineAbdessalem (@alaeddine-13)
- Joan Fontanals (@JoanFM)
- Naymul Islam (@ai-naymul)
💫 Release v3.20.0
Release Note (3.20.0)
Release time: 2023-08-04 08:58:38
This release contains 5 new features, 3 bug fixes, and 8 documentation improvements.
🆕 Features
Executor can work on single documents (#5991)
Executors no longer need to work solely on a DocList
, but can expose endpoints for working on single documents.
For this, the method decorated by requests
must take a doc
argument and an annotation for the input and output types.
from jina import Executor, requests
from docarray import BaseDoc

class MyInputDocument(BaseDoc):
    num: int

class MyOutputDocument(BaseDoc):
    label: str

class MyExecutor(Executor):
    @requests(on='/hello')
    async def task(self, doc: MyInputDocument, **kwargs) -> MyOutputDocument:
        return MyOutputDocument(label='even' if doc.num % 2 == 0 else 'odd')
This keeps Executor code clean, especially for serving models that can't benefit from working on batches of documents at the same time.
Parameters can be described as Pydantic models (#6001)
An Executor's parameters argument can now be a Pydantic model rather than a plain Python dictionary. To use a Pydantic model, the parameters argument needs to have the model as a type annotation.
Defining parameters as a Pydantic model instead of a simple Dict has two main benefits:
- Validation and default values: The parameters are validated against the model before the Executor can access an invalid key, and defaults are easy to define.
- Descriptive OpenAPI definition when using the HTTP protocol.
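The validation and default-value benefit can be seen with Pydantic alone. In this sketch, GenerationParams and its fields are hypothetical; in an Executor, the model would be used as the type annotation of the parameters argument, as described above.

```python
from pydantic import BaseModel, ValidationError


class GenerationParams(BaseModel):
    """Hypothetical parameters model; field names are illustrative only."""

    temperature: float = 0.7
    max_tokens: int = 64


# A plain dictionary, as a client would send it, parsed into the model.
parsed = GenerationParams(**{'max_tokens': 8})

# Invalid values are rejected up front instead of surfacing deep in the endpoint.
try:
    GenerationParams(**{'max_tokens': 'many'})
    rejected = False
except ValidationError:
    rejected = True
```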
Expose richer OpenAPI when serving Executor with HTTP inside a Flow (#5992)
Executors served with Deployments and Flows can now provide a descriptive OpenAPI when using HTTP. The description, examples and other relevant fields are used in the Gateway to provide a complete API.
Support streaming in single-Executor Flows (#5988)
Streaming endpoints now also support the Flow orchestration layer and it is no longer mandatory to use just a Deployment.
A Flow orchestration can accept streaming endpoints under both the gRPC and HTTP protocols.
with Flow(protocol=protocol, port=port, cors=True).add(
    uses=StreamingExecutor,
):
    client = Client(port=port, protocol=protocol, asyncio=True)
    i = 10
    async for doc in client.stream_doc(
        on='/hello',
        inputs=MyDocument(text='hello world', number=i),
        return_type=MyDocument,
    ):
        print(doc)
Streaming endpoints with gRPC protocol (#5921)
After adding SSE support to allow streaming documents one by one for the HTTP protocol, we added the same functionality for the gRPC protocol. A Jina server can now stream single Documents to a client, one at a time, using gRPC. This feature relies on streaming gRPC endpoints under the hood.
One typical use-case of this feature is streaming tokens from a Large Language Model. For instance, check our how-to on streaming LLM tokens.
from jina import Executor, requests, Deployment
from docarray import BaseDoc

# first define schemas
class MyDocument(BaseDoc):
    text: str

# then define the Executor
class MyExecutor(Executor):
    @requests(on='/hello')
    async def task(self, doc: MyDocument, **kwargs) -> MyDocument:
        for i in range(100):
            yield MyDocument(text=f'hello world {i}')

with Deployment(
    uses=MyExecutor,
    port=12345,
    protocol='grpc',  # or 'http'
) as dep:
    dep.block()
From the client side, you can use the new stream_doc() method to receive documents one by one:
from jina import Client, Document
client = Client(port=12345, protocol='grpc', asyncio=True)
async for doc in client.stream_doc(
    on='/hello', inputs=MyDocument(text='hello world'), return_type=MyDocument
):
    print(doc.text)
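An async for loop such as the one above has to run inside a coroutine. The sketch below shows one way to drive it from a script with asyncio.run; fake_stream_doc is a hypothetical stand-in for client.stream_doc so that the example runs without a server.

```python
import asyncio


async def fake_stream_doc(count):
    """Hypothetical stand-in for client.stream_doc: yields items one by one."""
    for i in range(count):
        await asyncio.sleep(0)  # give control back to the event loop
        yield f'hello world {i}'


async def collect(count):
    """Consume the stream inside a coroutine, as the client code must."""
    return [text async for text in fake_stream_doc(count)]


texts = asyncio.run(collect(3))
```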
🐞 Bug Fixes
Fix caching models from all endpoints, inputs and outputs (#6005)
An issue was fixed that caused problems when using an Executor inside a Flow where the same document type was used as input and output in different endpoints.
Use 127.0.0.1 as local ctrl address (#6004)
The orchestration layer will use 127.0.0.1 to send health checks to Executors and Gateways when working locally. It previously used 0.0.0.0 as the default host and this caused issues in some configurations.
Ignore warnings from Google (#5968)
Warnings that used to appear in relation to the pkg_resources deprecated API are now suppressed.
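For readers who want the same effect in their own code, the standard warnings module can filter by message. This is a generic sketch, not necessarily how Jina suppresses the warning; legacy_import and the message text are made up for the example.

```python
import warnings


def legacy_import():
    """Hypothetical function that emits a pkg_resources-style deprecation warning."""
    warnings.warn('pkg_resources is deprecated as an API', DeprecationWarning)
    return 'ok'


def count_warnings(suppress):
    """Run legacy_import and count the warnings that get through the filters."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        if suppress:
            # Added last, so this filter is checked before the 'always' rule.
            warnings.filterwarnings(
                'ignore', message=r'.*pkg_resources.*', category=DeprecationWarning
            )
        legacy_import()
    return len(caught)


unfiltered = count_warnings(suppress=False)
filtered = count_warnings(suppress=True)
```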
📗 Documentation Improvements
- Fix errors in getting started and preliminaries (#6008)
- Add note about Flow with one Executor supported (#5990)
- Add docs for secrets and jobs (#5948)
- Remove include gateway arg (#5987)
- Add Kubernetes hint to port ignore (#5985)
- Add docs for streaming (#5980)
- Add docs for jcloud horizontal pod autoscale (#5957)
- Update docs to show single document serving (#6009)
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
- Nikolas Pitsillos (@npitsillos)
- Alex Cureton-Griffiths (@alexcg1)
- Deepankar Mahapatro (@deepankarm)
- Winston Wong (@winstonww)
- AlaeddineAbdessalem (@alaeddine-13)
- Zhaofeng Miao (@mapleeit)
💫 Patch v3.19.1
Release Note (3.19.1)
Release time: 2023-07-19 07:54:28
This release contains 5 bug fixes.
🐞 Bug Fixes
Dynamic batching with docarray>=0.30 (#5970)
Dynamic batching requests for Executors were not working when using docarray>=0.30.0. This fix makes this feature fully compatible with newer versions of DocArray.
Monitoring validation error with docarray>=0.30 (#5965)
When using docarray>=0.30 in combination with monitoring, there was a risk of getting a validation error because the input and output schemas were not properly considered.
Fail fast when no valid schemas (#5962)
When no valid schemas were used, Executors sometimes failed to load, but the Gateway would continue to try getting the endpoints from them until it timed out. Now, everything will stop faster without the long wait.
Properly handle multiprotocol Deployments to Kubernetes (#5961)
When converting a Deployment using multiple protocols to Kubernetes YAML, the resulting services did not use or expose the ports and protocols as expected. This has now been fixed, so the following:
from jina import Deployment
d = Deployment(protocol=['grpc', 'http'])
d.to_kubernetes_yaml('./k8s-deployment')
produces YAML in which each of the two services exposes its own protocol.
Fix gRPC connectivity issues for health check (#5972)
Fixed an issue where a Flow could not health-check Executors when an HTTP proxy was in use.
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
💫 Release v3.19.0
Release Note (3.19.0)
Release time: 2023-07-10 09:01:16
This release contains 3 new features, 4 bug fixes, and 3 documentation improvements.
🆕 Features
Jina is now compatible with all versions of DocArray. Unpin version in requirements (#5941)
Jina is now fully compatible with docarray>=0.30, which uncaps the version requirement.
By default, Jina will install the latest DocArray version. However, it remains compatible with the older version. If you still want to use the old version and syntax, manually install docarray<0.30 or pin the requirement in your project.
from docarray import BaseDoc, DocList
from jina import Deployment, Executor, requests

class MyDoc(BaseDoc):
    text: str

class MyExec(Executor):
    @requests(on='/foo')
    def foo(self, docs: DocList[MyDoc], **kwargs):
        docs[0].text = 'hello world'

# Deployment takes the Executor through `uses`; `.add()` belongs to Flow
with Deployment(uses=MyExec) as d:
    docs = d.post(on='/foo', inputs=MyDoc(text='hello'), return_type=DocList[MyDoc])
    assert docs[0].text == 'hello world'
Use dynamic gateway Hubble image (#5935)
In order to make Flow compatible with both docarray>=0.30 and docarray<0.30 versions, Hubble provides utilities to adapt the jina and docarray versions to the user's system. This also requires that the gateway image used in K8s be rebuilt. To do this, we have created a Hubble image that dynamically adapts to the system's docarray version. This was necessary to provide support for all DocArray versions.
Add image_pull_secrets argument to Flow to enable pulling from private registry in Kubernetes (#5952)
In order for Kubernetes to pull docker images from a private registry, users need to create secrets that are passed to the Deployments as ImagePullSecrets.
Jina now provides an image_pull_secrets argument for Deployments and Flows, which makes sure that those secrets are used by Kubernetes after applying to_kubernetes_yaml:
from jina import Flow
f = Flow(image_pull_secrets=['regcred']).add()
f.to_kubernetes_yaml(...)
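On the Kubernetes side, image pull secrets are listed under imagePullSecrets in the pod spec. The sketch below shows that shape with plain dictionaries; the helper, the container name and the registry URL are hypothetical, and the YAML that Jina actually generates may be organized differently.

```python
def with_image_pull_secrets(pod_spec, secret_names):
    """Return a copy of a pod spec that references the given registry secrets."""
    updated = dict(pod_spec)
    updated['imagePullSecrets'] = [{'name': name} for name in secret_names]
    return updated


pod_spec = {
    'containers': [
        # Hypothetical private image, for illustration only.
        {'name': 'executor', 'image': 'registry.example.com/my-executor:latest'}
    ]
}
patched = with_image_pull_secrets(pod_spec, ['regcred'])
```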
🐞 Bug Fixes
Fix validation with default endpoint (#5956)
When using docarray>=0.30, the Gateway would not start if an Executor binding to the /default endpoint was connected to another Executor that did not bind to this special endpoint. The Gateway considered this to be an incompatible topology.
We have solved this problem, and the following is now possible:
from jina import Flow, Executor, requests

class Encoder(Executor):
    @requests  # no `on`: binds to the default endpoint
    def encode(self, **kwargs):
        pass

class Indexer(Executor):
    @requests(on='/index')
    def index(self, **kwargs):
        pass

    @requests(on='/search')
    def search(self, **kwargs):
        pass

f = Flow().add(uses=Encoder).add(uses=Indexer)
with f:
    f.block()
Apply return_type when return_responses=True (#5949)
When calling client.post with arguments return_type and return_responses=True, the return_type parameter was not properly applied. This is now fixed, and when accessing the docs of the Response they will have the expected type.
from jina import Executor, Deployment, requests
from docarray import DocList, BaseDoc

class InputDoc(BaseDoc):
    text: str

class OutputDoc(BaseDoc):
    len: int

class LenExecutor(Executor):
    @requests
    def foo(self, docs: DocList[InputDoc], **kwargs) -> DocList[OutputDoc]:
        ret = DocList[OutputDoc]()
        for doc in docs:
            ret.append(OutputDoc(len=len(doc.text)))
        return ret

d = Deployment(uses=LenExecutor)
with d:
    resp = d.post(
        "/",
        inputs=InputDoc(text="five"),
        return_type=DocList[OutputDoc],
        return_responses=True,
    )
    assert isinstance(resp[0].docs[0], OutputDoc)
Fix generator detection (#5947)
Jina wrongly tagged async methods as generators which should be used for single Document streaming. Now this is fixed and async methods can safely be used in Executors with docarray>=0.30.
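The distinction at stake is between a coroutine function and an async generator function. The standard inspect module can tell them apart, as in this sketch; is_streaming_endpoint is a hypothetical helper, not Jina's actual detection code.

```python
import inspect


async def regular_endpoint(doc):
    """A plain async method: returns a single result."""
    return doc


async def streaming_endpoint(doc):
    """An async generator: yields results one at a time."""
    for i in range(3):
        yield f'{doc} {i}'


def is_streaming_endpoint(func):
    """Only generator functions (sync or async) count as streaming endpoints."""
    return inspect.isgeneratorfunction(func) or inspect.isasyncgenfunction(func)


regular_is_streaming = is_streaming_endpoint(regular_endpoint)
generator_is_streaming = is_streaming_endpoint(streaming_endpoint)
```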
Fix Flow.plot method (#5934)
The plot method for Flow was producing the broken URL https://mermaid.ink/. This has now been fixed.
📗 Documentation Improvements
- Adapt documentation to focus on new DocArray (#5941)
- Text not tags in code snippets (#5930)
- Changes for the links and hugging face model name (#5955)
🤟 Contributors
We would like to thank all contributors to this release:
💫 Release v3.18.0
Release Note (3.18.0)
Release time: 2023-06-22 13:32:50
This release contains 2 new features, 4 bug fixes, and 2 documentation improvements.
🆕 Features
Streaming single Document with HTTP SSE for Deployment (#5899)
In this release, we have added support for Server-Sent Events (SSE) to Jina's HTTP protocol with the Deployment orchestration. A Jina server can now stream single Documents to a client, one at a time. This is useful for applications that require a continuous stream of data, such as chatbots (using Large Language Models) or real-time translation.
Simply define an endpoint function that receives a single Document and yields Documents one by one:
from jina import Executor, requests, Document, Deployment

class MyExecutor(Executor):
    @requests(on='/hello')
    async def task(self, doc: Document, **kwargs):
        for i in range(3):
            yield Document(text=f'{doc.text} {i}')

with Deployment(
    uses=MyExecutor,
    port=12345,
    protocol='http',
    cors=True,
    include_gateway=False,
) as dep:
    dep.block()
From the client side, you can use the new stream_doc method to receive the Documents one by one:
from jina import Client, Document
client = Client(port=12345, protocol='http', cors=True, asyncio=True)
async for doc in client.stream_doc(
    on='/hello', inputs=Document(text='hello world')
):
    print(doc.text)
Note that the SSE client is language-independent. This feature also supports DocArray v2. For more information, see the streaming endpoints section of the Jina documentation.
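Because SSE is a plain-text format, any client that can read lines can consume it. The sketch below parses "data:" lines into events following the general SSE framing (a blank line ends an event). The sample payload is an assumption made for illustration; the exact JSON that a Jina server emits may differ.

```python
import json


def parse_sse(lines):
    """Yield the data payload of each event in a stream of SSE lines."""
    data = []
    for line in lines:
        line = line.rstrip('\r\n')
        if line == '':
            # A blank line terminates the current event.
            if data:
                yield '\n'.join(data)
                data = []
        elif line.startswith('data:'):
            value = line[len('data:'):]
            # A single leading space after the colon is not part of the value.
            data.append(value[1:] if value.startswith(' ') else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


# Assumed payload shape, for illustration only.
raw = (
    'data: {"text": "hello world 0"}\n'
    '\n'
    'data: {"text": "hello world 1"}\n'
    '\n'
)
events = [json.loads(payload) for payload in parse_sse(raw.splitlines())]
```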
Add env_from_secret option to Gateway (#5914)
We have added the env_from_secret parameter to Gateway to allow custom gateways to load secrets from Kubernetes when transformed to Kubernetes YAML in the same way as Executors.
from jina import Flow
f = Flow().config_gateway(env_from_secret={'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'}}).add()
f.to_kubernetes_yaml()
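In Kubernetes, an environment variable backed by a secret is expressed with valueFrom and secretKeyRef. The sketch below maps the env_from_secret dictionary from the example above onto that standard container env shape; the helper name is hypothetical, and the YAML that Jina writes may be laid out differently.

```python
def env_from_secret_to_k8s(env_from_secret):
    """Translate {ENV_NAME: {'name': secret, 'key': key}} into container env entries."""
    return [
        {
            'name': env_name,
            'valueFrom': {
                'secretKeyRef': {'name': ref['name'], 'key': ref['key']}
            },
        }
        for env_name, ref in env_from_secret.items()
    ]


env_entries = env_from_secret_to_k8s(
    {'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'}}
)
```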
🐞 Bug Fixes
Fix error working with some data types in DocArray V2 (#5905) (#5908)
We have fixed some errors when DocArray v2 documents contained flexible dictionaries, Lists, or Tensors.
Fix reloading Executor when it is loaded from config.yml (#5915)
The reload option of an Executor was not working when the Executor was loaded from config.yml, so Jina was not able to pick up changes to the Executor code.
from jina import Flow

f = Flow().add(uses='config.yml', reload=True)
with f:
    f.block()
Ensure closing Executor at shutdown in Deployment with HTTP protocol (#5906)
We have fixed a bug that prevented the close method of the Executor from being executed at shutdown when the Deployment was exposed with an HTTP server.
Fix issue in mismatch endpoint when using shards (#5904)
A KeyError was raised when working with DocArray v2 in a Deployment using shards if the endpoints were not matching.
With this fix, it will properly call the default endpoint.
from typing import List

import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from jina import Deployment, Executor, requests

class MyDoc(BaseDoc):
    text: str
    embedding: NdArray[128]

class MyDocWithMatchesAndScores(MyDoc):
    matches: DocList[MyDoc]
    scores: List[float]

class MyExec(Executor):
    @requests
    def foo(self, docs: DocList[MyDoc], **kwargs) -> DocList[MyDocWithMatchesAndScores]:
        res = DocList[MyDocWithMatchesAndScores]()
        for doc in docs:
            new_doc = MyDocWithMatchesAndScores(
                text=doc.text,
                embedding=doc.embedding,
                matches=docs,
                scores=[1.0 for _ in docs],
            )
            res.append(new_doc)
        return res

d = Deployment(uses=MyExec, shards=2)
with d:
    res = d.post(on='/', inputs=DocList[MyDoc]([MyDoc(text='hey ha', embedding=np.random.rand(128))]))
    assert len(res) == 1
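The fallback behavior can be pictured as a dictionary lookup that defaults instead of raising. This is only a sketch of the idea: pick_handler and the handler table are hypothetical, and the '/default' key is an assumption based on the special endpoint name used elsewhere in these notes.

```python
def pick_handler(handlers, endpoint, default_endpoint='/default'):
    """Return the handler bound to `endpoint`, or the default one if it is unbound."""
    if endpoint in handlers:
        return handlers[endpoint]
    # Previously a direct handlers[endpoint] lookup would raise KeyError here.
    return handlers[default_endpoint]


handlers = {
    '/default': lambda docs: ('default', docs),
    '/index': lambda docs: ('index', docs),
}

matched = pick_handler(handlers, '/index')(['doc'])
fallback = pick_handler(handlers, '/')(['doc'])
```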
📗 Documentation Improvements
🤟 Contributors
We would like to thank all contributors to this release:
- Alex Cureton-Griffiths (@alexcg1)
- AlaeddineAbdessalem (@alaeddine-13)
- Joan Fontanals (@JoanFM)
💫 Release v3.17.0
Release Note (3.17.0)
Release time: 2023-06-06 15:25:15
This release contains 1 new feature and 1 bug fix.
🆕 Features
Flows now compatible with DocArray v2 (#5861)
Finally, Flows and Deployments are fully compatible with the new DocArray version (above 0.30.0). This includes all supported protocols and features, namely http, grpc and websocket for Flow and http and grpc for Deployment.
Now you can enjoy the capacity and expressivity of the new DocArray together with the performance, scalability and richness of Jina.
from docarray import BaseDoc, DocList
from jina import Client, Executor, Flow, requests

class MyDoc(BaseDoc):
    text: str

class MyExec(Executor):
    @requests(on='/foo')
    def foo(self, docs: DocList[MyDoc], **kwargs):
        docs[0].text = 'hello world'

ports = [12345, 12346]
protocols = ['http', 'grpc']

with Flow(protocol=protocols, ports=ports).add(uses=MyExec):
    # query the same Flow once per exposed protocol
    for port, protocol in zip(ports, protocols):
        c = Client(port=port, protocol=protocol)
        docs = c.post(on='/foo', inputs=MyDoc(text='hello'), return_type=DocList[MyDoc])
        assert docs[0].text == 'hello world'
While Jina is fully compatible with the new DocArray version, for now it will not install the latest version, since there remain compatibility issues with Hubble and JCloud. As soon as these are resolved, Jina will install the new version by default.
🐞 Bug Fixes
Fix instantiation of Executor with write decorator (#5897)
Fix instantiation of Executors where the first method is decorated with a @write decorator.
from jina import Executor, requests
from jina.serve.executors.decorators import write

class WriteExecutor(Executor):
    @write
    @requests(on='/delete')
    def delete(self, **kwargs):
        pass

    @requests(on='/bar')
    @write
    def bar(self, **kwargs):
        pass
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
💫 Patch v3.16.1
Release Note (3.16.1)
Release time: 2023-05-23 14:59:35
This patch contains 1 refactoring, 3 bug fixes and 3 documentation improvements.
⚙ Refactoring
Remove aiostream dependency (#5891)
Remove the aiostream dependency, which could have been a source of licensing problems.
🐞 Bug Fixes
Fix usage of ports and protocols alias (#5885)
You can now use plural ports and protocols:
from jina import Flow

f = Flow(ports=[12345, 12346], protocols=['grpc', 'http']).add()
with f:
    f.block()
Previously, the arguments would not be correctly applied to the inner Deployments.
Fix endpoint printing (#5884)
Fix endpoint printing when no ports are specified with multi-protocol Deployments:
from jina import Deployment, Executor, requests, Client
d = Deployment(protocols=['grpc', 'http'])
with d:
    pass
Fix docs and redocs links (#5883)
Fix the printed links to docs and redocs pages when HTTP protocol is used in combination with other protocols.
from jina import Deployment, Executor, requests, Client
d = Deployment(protocol=['grpc', 'http'], port=[12345, 12346])
with d:
    pass
📗 Documentation Improvements
- Fix import in notebook causing a crash (#5888)
- Add docs for Jina Cloud logs (#5892)
- Fix YAML specs links (#5887)
🤟 Contributors
We would like to thank all contributors to this release:
- Joan Fontanals (@JoanFM)
- Han Xiao (@hanxiao)
- Alex Cureton-Griffiths (@alexcg1)
- Nikolas Pitsillos (@npitsillos)