Support for quantized embedding models #8779

d1pankarmedhi · 2025-01-28T10:05:35Z


Can we use BitsAndBytes quantized encoder models in the SentenceTransformersDocumentEmbedder? I went through the code but didn't find any implementation for handling quantized models.

anakin87 · 2025-01-28T13:52:51Z

Maintainer

Hello!

I'm not sure that Sentence Transformers supports embedding models quantized with bitsandbytes.

I did a quick search and did not find anything.

Could you share some pointers/resources?


d1pankarmedhi · 2025-01-28T16:18:20Z

Author

Actually, I was trying to load a bitsandbytes-quantized Sentence Transformers model, saved locally, using SentenceTransformersDocumentEmbedder.

from haystack.components.embedders import SentenceTransformersDocumentEmbedder

doc_embedder = SentenceTransformersDocumentEmbedder(model="<path-to-quantized-model>")
doc_embedder.warm_up()

So I was wondering whether Haystack implements any other way to load quantized models directly.

2 replies

d1pankarmedhi Jan 28, 2025
Author

One of our projects uses Haystack 1.x. We have a hybrid pipeline, and we saw a significant improvement with a quantized cross-encoder model without losing much accuracy.

from haystack.nodes import JoinDocuments, SentenceTransformersRanker
join_documents = JoinDocuments(join_mode="concatenate")
rerank = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2")

So I was trying to find out whether it is possible to quantize these models and load them directly with Haystack without making any major changes to the existing code.

anakin87 Jan 28, 2025
Maintainer

Are you interested in Embedders or Rankers?

For Embedders based on Sentence Transformers, I don't think there is an easy solution, because Sentence Transformers does not support this.
For Rankers, the TransformersSimilarityRanker, for example, is based on Transformers (not Sentence Transformers), so this may be feasible.
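
To illustrate the Ranker route, here is a rough, untested sketch in Haystack 2.x. It assumes that TransformersSimilarityRanker forwards its `model_kwargs` to the Transformers `from_pretrained` call, and that bitsandbytes and a CUDA GPU are available. The helper names `bnb_4bit_settings` and `build_quantized_ranker` are hypothetical, and the model name is simply the one mentioned earlier in this thread.

```python
from typing import Any, Dict


def bnb_4bit_settings() -> Dict[str, Any]:
    """Hypothetical helper: 4-bit settings as a plain dict.

    The keys are field names of transformers.BitsAndBytesConfig.
    """
    return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}


def build_quantized_ranker(model: str = "cross-encoder/ms-marco-MiniLM-L-6-v2"):
    """Hypothetical helper: build a ranker that asks Transformers to quantize on load."""
    # Heavy imports are deferred so this module can be imported without them.
    from haystack.components.rankers import TransformersSimilarityRanker
    from transformers import BitsAndBytesConfig

    quantization_config = BitsAndBytesConfig(**bnb_4bit_settings())

    # Assumption: model_kwargs is passed through to from_pretrained unchanged.
    return TransformersSimilarityRanker(
        model=model,
        model_kwargs={"quantization_config": quantization_config},
    )
```

If this works in your setup, you would call `build_quantized_ranker()`, then `warm_up()` on the result, and use `run(query=..., documents=...)` as with any other ranker. Serializing a pipeline that holds a config object in `model_kwargs` may need extra care.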

d1pankarmedhi · 2025-01-30T04:05:02Z

Author

Thanks @anakin87, we found a way.

