Support for quantized embedding models #8779
Replies: 3 comments 2 replies
-
Hello! I'm not sure that Sentence Transformers supports embedding models quantized with [...]. I did a quick search and did not find anything. Could you share some pointers/resources?
-
Actually, I was trying to load a quantized model like this:

    from haystack.components.embedders import SentenceTransformersDocumentEmbedder

    doc_embedder = SentenceTransformersDocumentEmbedder(model="<path-to-quantized-model>")
    doc_embedder.warm_up()

So, I was just wondering if there's any other method, implemented in Haystack, to load quantized models directly.
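For anyone landing here with the same question, one possible route is sketched below. Nothing in this thread confirms it. It assumes your installed Haystack version exposes a `model_kwargs` argument on `SentenceTransformersDocumentEmbedder`, and that Sentence Transformers forwards it to the underlying `from_pretrained` call in Transformers. It also assumes `bitsandbytes` is installed and a supported GPU is available. The helper names `embedder_init_kwargs` and `build_quantized_embedder` are hypothetical and exist only for illustration.

```python
from typing import Any, Dict


def embedder_init_kwargs(model_path: str, quantization_config: Any) -> Dict[str, Any]:
    """Assemble constructor arguments for the embedder (hypothetical helper).

    Assumption: the embedder accepts `model_kwargs` and passes it through
    to the model loader, which understands `quantization_config`.
    """
    return {
        "model": model_path,
        "model_kwargs": {"quantization_config": quantization_config},
    }


def build_quantized_embedder(model_path: str):
    """Create a document embedder that requests 8-bit loading (hypothetical helper)."""
    # Imported lazily so embedder_init_kwargs stays usable without these packages.
    from transformers import BitsAndBytesConfig
    from haystack.components.embedders import SentenceTransformersDocumentEmbedder

    # Ask Transformers to load the weights in 8-bit via bitsandbytes.
    config = BitsAndBytesConfig(load_in_8bit=True)
    return SentenceTransformersDocumentEmbedder(
        **embedder_init_kwargs(model_path, config)
    )
```

You would then call `build_quantized_embedder("<path-to-quantized-model>")` followed by `warm_up()`, as in the snippet above. Whether this works depends on your library versions and hardware, so treat it as a starting point and not as a confirmed solution.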
-
Thanks @anakin87, we found a way.
-
Can we use BitsAndBytes quantized encoder models in the SentenceTransformersDocumentEmbedder? I went through the code but didn't find any implementation for handling quantized models.