-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: SentenceTransformersDocumentEmbedder and SentenceTransformersTextEmbedder can accept and pass any arguments to SentenceTransformer.encode #8806
Conversation
…xtEmbedder can accept and pass any arguments to SentenceTransformer.encode
Pull Request Test Coverage Report for Build 13161311307Details
💛 - Coveralls |
Hello and thanks for the PR! While I understand the need of passing So I need to think a bit about covering the use case you proposed... I'll let you know! |
Thanks for the feedback! Looking forward to hearing back from you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, this approach is OK.
I've left some comments.
Plus,
- let's test that
encode_kwargs
are passed to the embedding backend using mocks (this test can be an inspiration:haystack/test/components/embedders/test_sentence_transformers_document_embedder.py
Line 298 in 5ae9488
def test_embed_metadata(self): - merge main from your branch: I recently merged a PR (fix: fix test failures with Transformers models in PRs from forks #8809) that should make tests pass
haystack/components/embedders/sentence_transformers_document_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_document_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_text_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_text_embedder.py
Outdated
Show resolved
Hide resolved
…or-sentence-transformers
…dder and SentenceTransformersTextEmbedder mae to be the last positional parameter for backward compatibility reasons
…Embedder and SentenceTransformersDocumentEmbedder
…er and SentenceTransformersDocumentEmbedder
@anakin87 Thanks for the feedback! Went through all of them, and fixed my PR accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some final things to fix
haystack/components/embedders/sentence_transformers_document_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_text_embedder.py
Outdated
Show resolved
Hide resolved
haystack/components/embedders/sentence_transformers_text_embedder.py
Outdated
Show resolved
Hide resolved
…mbedder and SentenceTransformersDocumentEmbedder
…dder and SentenceTransformersTextEmbedder mae to be the last positional parameter for backward compatibility (part II.)
Sorry for my oversight, now should be good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
Thanks for your contribution! 💙
Enhanced
SentenceTransformersDocumentEmbedder
andSentenceTransformersTextEmbedder
to accept an additional parameter, which is passed directly to the underlyingSentenceTransformer.encode
method for greater flexibility in embedding customization.Related Issues
No related issues
Proposed Changes:
Recent embedding methods requires extra parameters when used with
SentenceTransformer
(e.g.task
,prompt
,prompt_name
). One specific example is jina-embeddings-v3 which supports task-specific embeddings through thetask
paramater.How did you test it?
Extended corresponding unit tests and make them pass.
Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
and added!
in case the PR includes breaking changes.