Add support to automatic "hybrid" streaming #182
…omponents assigning proper streaming_callback type
…ync method signature rather than components attributes)
@sjrl I know it's a bit off-topic, but in this PR I've also improved the check for streaming-capable components. Indeed, before this we were still looking for … So now the behaviour is the following:
This will make streaming work also with components like FallbackChatGenerator, which doesn't have …
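As a rough illustration of checking a method signature rather than component attributes, here is a minimal sketch using only the standard library. The helper `accepts_streaming_callback` and both example classes are hypothetical and are not taken from this PR; the sketch assumes that "streaming capable" means the `run` or `run_async` method declares a `streaming_callback` parameter.

```python
import inspect


def accepts_streaming_callback(component, method_name="run"):
    """Return True if the named method declares a `streaming_callback` parameter.

    Hypothetical helper: it inspects the method signature only and ignores
    any attribute of the same name on the component.
    """
    method = getattr(component, method_name, None)
    if method is None or not callable(method):
        return False
    return "streaming_callback" in inspect.signature(method).parameters


class AttrOnlyGenerator:
    # Has the attribute, but run() cannot receive a callback per call.
    streaming_callback = None

    def run(self, prompt):
        return {"replies": [prompt]}


class WrapperGenerator:
    # No attribute at all, yet both run() and run_async() accept a callback.
    def run(self, prompt, streaming_callback=None):
        return {"replies": [prompt]}

    async def run_async(self, prompt, streaming_callback=None):
        return {"replies": [prompt]}
```

With a signature-based check like this, a wrapper-style component would be treated as streaming capable even though it exposes no `streaming_callback` attribute.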
Co-authored-by: Sebastian Husch Lee <[email protected]>
Looks good!
Ideally, an AsyncPipeline should contain only async-compatible components with async streaming_callback support.

However, there are cases where a pipeline might include a mix of components, some with sync-only and some with async streaming support.

For example, you might have a pipeline that includes an older generator such as HuggingFaceLocalGenerator (or any other component lacking async streaming support) alongside an async-compatible one like OpenAIChatGenerator.

With this, I am introducing a parameter to async_streaming_generator called allow_sync_streaming_callbacks, which accepts the following values:

- False (default): normal (strict) behaviour - accepts only async-streaming-compatible components on AsyncPipeline.
- "auto": automatically detects and enables hybrid mode if needed. Hybrid mode allows components with sync-only streaming callback support to work in async pipelines.

Using "auto" mode will allow async streaming to work with (old) generators like HuggingFaceLocalGenerator.

Running synchronous callbacks within an async streaming context is still not recommended, as it introduces a minor overhead, though one that is typically negligible (~1–2 μs per call).
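
To make the idea of hybrid mode more concrete, below is a minimal, standard-library-only sketch of one way a synchronous streaming callback can feed an async generator. It is not the implementation in this PR: `stream`, `make_bridge_callback`, and `sync_only_generator` are hypothetical names, and the stand-in generator is always sync-only, so the "auto" detection step is reduced to a simple flag check.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the stream


def make_bridge_callback(loop, queue):
    """Build a sync callback that hands chunks to an asyncio.Queue from a worker thread."""
    def sync_callback(chunk):
        # call_soon_threadsafe is the supported way to touch the loop from another thread.
        loop.call_soon_threadsafe(queue.put_nowait, chunk)
    return sync_callback


def sync_only_generator(prompt, streaming_callback):
    # Hypothetical stand-in for a component without async streaming support.
    for token in prompt.split():
        streaming_callback(token)


async def stream(prompt, allow_sync_streaming_callbacks=False):
    """Yield chunks asynchronously; only "auto" permits the sync-only component."""
    if allow_sync_streaming_callbacks != "auto":
        # Strict mode: refuse components that only support sync callbacks.
        raise ValueError("component does not support async streaming callbacks")

    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    callback = make_bridge_callback(loop, queue)

    def worker():
        try:
            sync_only_generator(prompt, callback)
        finally:
            # Always signal completion, even if the component raised.
            loop.call_soon_threadsafe(queue.put_nowait, _DONE)

    # Run the blocking component in the default thread pool.
    task = loop.run_in_executor(None, worker)
    while True:
        item = await queue.get()
        if item is _DONE:
            break
        yield item
    await task  # re-raise any exception from the worker


async def collect(prompt, mode):
    return [chunk async for chunk in stream(prompt, mode)]
```

In this sketch the extra hop through `call_soon_threadsafe` is where the per-call overhead of a sync callback would come from; a natively async callback would skip it.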