fix(HuggingFaceLocalGenerator): remove stop_words cross-product in reply post-processing#11502
Closed
alvinttang wants to merge 1 commit into
anakin87
requested changes
Jun 4, 2026
anakin87
left a comment
Member
Thank you for this PR.
Please sign the CLA, then ping me and I'll proceed with the actual review.
9fc7c40 to
7697c9b
Author
recheck
…ply post-processing
With N replies and M stop_words, the previous nested comprehension produced N*M replies instead of N. Half of the extra replies still contained the stop word because each iteration only stripped one. Switching to a sequential loop (already what the chat sibling at chat/hugging_face_local.py:660 does) keeps the count at N and removes every stop word from every reply.
Refs deepset-ai#11409
e971f30 to
6ee9af3
Author
@anakin87 CLA is signed now (had to re-author the commit under the right email to make cla-assistant pick it up). Ready for review whenever you have a moment, thanks.
Contributor
Thanks @alvinttang, the bug has already been fixed in #11413.
Refs #11409.
HuggingFaceLocalGenerator.run post-processes replies with a nested list comprehension:
That's a cross-product. With N replies and M stop words it emits N*M replies, and only every M-th one has every stop word removed. Half the output silently still contains a stop word.
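To make the cross-product concrete, here is a minimal standalone sketch. It is not the library's code: the function name `strip_stop_words_nested`, the sample replies, the stop words, and the trailing `rstrip()` are all assumptions chosen for illustration, and each sample reply is assumed to end with exactly one stop word.

```python
def strip_stop_words_nested(replies, stop_words):
    # Hypothetical stand-in for the nested-comprehension pattern described above.
    # The two `for` clauses iterate over every (reply, stop_word) pair,
    # so the result has len(replies) * len(stop_words) entries.
    return [
        reply.replace(stop_word, "").rstrip()
        for reply in replies
        for stop_word in stop_words
    ]


# Illustrative inputs (assumed): each reply ends with a different stop word.
replies = ["Paris is the capital<END>", "Berlin is the capital<STOP>"]
stop_words = ["<END>", "<STOP>"]

nested = strip_stop_words_nested(replies, stop_words)

# Count how many emitted replies still contain some stop word.
leftover = [r for r in nested if any(sw in r for sw in stop_words)]

print(len(nested))    # 4 replies for 2 inputs (N * M)
print(len(leftover))  # 2 of them still contain a stop word
```

With two replies and two stop words the sketch returns four strings, and two of them keep their stop word because each pair strips only one.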
The chat sibling at chat/hugging_face_local.py:660 already does this correctly with a sequential loop, so this PR aligns the non-chat path with the same pattern.
RED
Two new regression tests on main:
The original test_run_stop_words_removal (single stop word) keeps passing.
GREEN
Full file:
28 passed, 1 deselected (integration, model download), 6 warnings in 140.60s. No regression elsewhere.
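For reference, the sequential-loop approach this PR describes can be sketched in isolation as below. This is an illustration under assumptions, not the code from this PR or from the chat component: the helper name `strip_stop_words_sequential`, the sample data, and the final `rstrip()` are hypothetical.

```python
def strip_stop_words_sequential(replies, stop_words):
    # Hypothetical helper: apply every stop word to the same reply in turn,
    # then emit exactly one cleaned string per input reply.
    cleaned = []
    for reply in replies:
        for stop_word in stop_words:
            reply = reply.replace(stop_word, "")
        cleaned.append(reply.rstrip())
    return cleaned


# Illustrative inputs (assumed), including a reply with both stop words.
sample_replies = [
    "Paris is the capital<END>",
    "Berlin is the capital<STOP>",
    "Rome<END> is the capital<STOP>",
]
sample_stop_words = ["<END>", "<STOP>"]

result = strip_stop_words_sequential(sample_replies, sample_stop_words)

print(len(result))  # 3, one output per input reply
print(result)
```

Because the inner loop reassigns `reply` before appending, the output length always equals the input length and no listed stop word survives in any reply.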