Scalar Quantization memory cost estimation #4803

xtyDoge · 2024-08-02T09:41:42Z

I have a 3-node Qdrant cluster and a collection with 16 million vectors and 2 replicas. The vector config is shown below:

        "vectors": {
          "size": 512,
          "distance": "Dot",
          "on_disk": true
        },

I want to apply Scalar Quantization to this collection to reduce disk load when querying it. How can I estimate the memory cost on each node?
My estimate is 16,000,000 * 512 * 4 B (float64) / 4 (quantization compression factor), approximately 8 GB in total. Per node that is 8 GB * 2 / 3 = 5.3 GB of memory usage. Is that right?

timvisee (Maintainer) · 2024-08-07T12:15:08Z

Roughly, yes. Note that it's float32, which is indeed 4 bytes.

That would just be the quantized vectors though. It does not include other memory consumers, such as payload data and a bit of overhead for the collection itself.

As always, I'd recommend testing in practice whether your estimate above holds true.
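
For reference, the arithmetic from the question can be written out as a small script. This is only a rough sketch under a few assumptions that are not stated in the thread: that the quantized vectors take 1 byte per component (float32 at 4 bytes divided by the compression factor of 4), and that the replicas are spread evenly over the nodes. The function name `quantized_vector_memory_bytes` and its parameters are made up for illustration and are not part of any Qdrant API. It covers the quantized vectors only, not payload data or collection overhead.

```python
def quantized_vector_memory_bytes(num_vectors, dim, replicas, nodes,
                                  bytes_per_component=1):
    """Rough estimate of memory used by quantized vectors only.

    Assumes 1 byte per component after quantization (float32 / 4) and an
    even distribution of all replicas across the nodes. Payload data and
    collection overhead are NOT included.
    """
    # Size of one full copy of the quantized vectors.
    total = num_vectors * dim * bytes_per_component
    # Every replica is a full copy; spread all copies over the nodes.
    per_node = total * replicas / nodes
    return total, per_node


# Numbers from the question: 16M vectors, 512 dimensions, 2 replicas, 3 nodes.
total, per_node = quantized_vector_memory_bytes(16_000_000, 512, 2, 3)
print(f"one copy: {total / 1e9:.2f} GB, per node: {per_node / 1e9:.2f} GB")
```

The per-node figure comes out slightly above the 5.3 GB in the question because that number was computed from the total rounded down to 8 GB.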
