I made a few h8 quants expecting them to be better, but it bothered me that there was no information on whether that was actually true. There are sometimes perplexity differences even between exllamav2 versions, so it's hard to tell whether any comparison is conclusive in the long run.
Has there been any comprehensive research on whether changing any of the defaults in convert.py has a beneficial effect? I see anecdotal comments that a particular calibration length and row count is better, and that a 6-bit head is better at 6.0 bpw and below while an 8-bit head is better above 6.0 bpw. Are those differences actually noticeable? Also, is there any benefit to changing the measurement length and rows? (A sketch of the knobs I mean follows below.)
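For concreteness, here is roughly the invocation I have in mind; the flag names and default values are my assumptions based on exllamav2's documentation and may differ between versions:

```bash
# Assumed convert.py flags (check against your exllamav2 version):
#   -b          target bits per weight (bpw)
#   -hb         head bits (the anecdote: 6 at <=6.0 bpw, 8 above)
#   -l  / -r    calibration length / rows (defaults assumed: 2048 / 100)
#   -ml / -mr   measurement length / rows (defaults assumed: 2048 / 16)
python convert.py \
    -i /path/to/model \
    -o /path/to/workdir \
    -cf /path/to/output \
    -b 6.0 -hb 6 \
    -l 2048 -r 100 \
    -ml 2048 -mr 16
```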
I make my own quants of models I like and keep them in HF repos, and I would like to know I am making them in the best way possible. I understand the benefit of the highest bpw you can squeeze onto your GPUs; it's the rest I am unsure about.
I am starting to do my own testing, from 8B-parameter models up to 120B, but damn it's time-consuming and I really don't want to. 😭
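To make the testing a bit less painful I was thinking of scripting it, something like the sketch below; it assumes exllamav2's bundled test_inference.py supports a perplexity eval via -ed, and the paths are just placeholders:

```bash
# Assumed batch perplexity comparison across several quants of one model;
# test_inference.py and its -m / -ed flags are taken from exllamav2's repo
# and should be verified against the version you have installed.
for quant in /quants/llama3-8b-*bpw; do
    echo "== $quant =="
    python test_inference.py -m "$quant" -ed /data/wikitext-2/test.parquet
done
```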