
Batched inferencing using HuggingfaceLocalGenerator #8770

Open
srsingh24 opened this issue Jan 24, 2025 · 4 comments
Labels
P3 Low priority, leave it in the backlog · type:feature New feature or request

Comments

@srsingh24

Is there a way to perform batched inference using HuggingFaceLocalGenerator? I could not find any information about this in the docs.

@anakin87
Member

Hello!

While it should be possible to configure the batch_size of the Hugging Face pipeline/model under the hood, this component only accepts a single prompt (str) as input, so in practice batching is not possible.
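
For reference, here is a rough sketch of what batching looks like at the transformers level, i.e. calling the underlying pipeline directly instead of going through the component. The model name and parameters below are placeholders, not a recommendation:

```python
from transformers import pipeline

# The plain transformers pipeline accepts a list of prompts plus a batch_size,
# which is what HuggingFaceLocalGenerator.run(prompt: str) cannot expose today.
pipe = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder model, use whatever you actually run
    device_map="auto",
)

# Decoder-only models need a pad token for batched generation.
if pipe.tokenizer.pad_token_id is None:
    pipe.tokenizer.pad_token_id = pipe.model.config.eos_token_id

prompts = ["Explain RAG in one sentence.", "What is Haystack?", "Summarize batching."]
outputs = pipe(prompts, batch_size=8, max_new_tokens=64)

for out in outputs:
    # For a list input, each element is a list with one dict per returned sequence.
    print(out[0]["generated_text"])
```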

Could you tell me more about your use case?

@anakin87 anakin87 added the type:feature New feature or request label Jan 24, 2025
@srsingh24
Author

srsingh24 commented Jan 24, 2025

@anakin87 Since HuggingFaceLocalGenerator only accepts a single str input, I have to loop through each prompt sequentially, which makes my code very slow. I would like to perform batched inference so I can make better use of my GPUs and run inference faster.
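
Roughly, my current workaround looks like the sketch below (simplified, assuming the standard Haystack 2.x API; the model and generation settings are placeholders rather than my exact setup):

```python
from haystack.components.generators import HuggingFaceLocalGenerator

# Simplified sketch of the sequential workaround: one run() call per prompt,
# so the GPU only ever sees an effective batch size of 1.
generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",            # placeholder model
    task="text2text-generation",
    generation_kwargs={"max_new_tokens": 128},
)
generator.warm_up()

prompts = [f"Summarize document {i}" for i in range(8)]  # hundreds of prompts in practice

replies = []
for prompt in prompts:
    result = generator.run(prompt=prompt)  # single str in, single reply out
    replies.extend(result["replies"])
```

With batched inference, this loop could become a single call over the whole list.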

@anakin87
Member

Got it... I wanted to better understand what you are building.
Are you running evaluation? Performing RAG?

@srsingh24
Author

I am using HuggingFaceLocalGenerator for all sorts of use cases, from basic inference on a list of prompts to evaluation to RAG.

@julian-risch julian-risch added the P3 Low priority, leave it in the backlog label Jan 31, 2025