Allow compressed-tensors quantized model to be trained #34520

Draft
wants to merge 12 commits into base: main
Conversation

horheynm
Contributor

What does this PR do?

Using HFQuantizer, models that were quantized with compressed-tensors can already be loaded.
This PR fixes the remaining issue so that those models can also be trained (see the sketch below).
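
A rough sketch of the intended usage, assuming a hypothetical compressed-tensors checkpoint on the Hub and plain Trainer fine-tuning; the model id, dataset, and hyperparameters below are illustrative only, not part of this PR:

```python
# Rough sketch only: "org/llama-example-w8a8-compressed-tensors" is a
# hypothetical checkpoint name, and the dataset / hyperparameters are
# placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "org/llama-example-w8a8-compressed-tensors"  # hypothetical

# Loading already works: the checkpoint's quantization_config is picked up
# and routed through the compressed-tensors HFQuantizer.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# What this PR enables: passing that quantized model to Trainer and
# fine-tuning it, instead of it being treated as non-trainable.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ct-qat-out",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```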

Who can review?

@SunMarc @younesbelkada

@horheynm changed the title from "Nm train quantized models from compressed tensors" to "Allow compressed-tensors quantized model to be trained" on Oct 30, 2024
@SunMarc (Member) left a comment

Thanks for the PR! I left a few suggestions. Could you explain a bit more how you are performing training with compressed-tensors models if you are not using PEFT? Are you maybe doing QAT, or just adding custom LoRA layers yourself?

src/transformers/utils/quantization_config.py (outdated, resolved)
src/transformers/trainer.py (outdated, resolved)
@horheynm horheynm marked this pull request as ready for review November 5, 2024 16:29
@horheynm horheynm marked this pull request as draft November 5, 2024 16:31
@horheynm (Contributor, Author) commented Nov 6, 2024

@SunMarc
Yes, we quantize the model using oneshot from compressed-tensors, then load that model with AutoModelForCausalLM and HFQuantizer. Once loaded, we run training as QAT finetuning; the 'quantization' applied during training is fake quant.

We are not using LoRA adapters.
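
For readers unfamiliar with the term: fake quant here means the forward pass simulates quantized numerics (round to the integer grid, then immediately dequantize) while weights and gradients stay in floating point, which is what makes QAT finetuning possible. A generic PyTorch illustration of the idea, not the compressed-tensors implementation:

```python
import torch

def fake_quantize(x: torch.Tensor, scale: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Generic symmetric fake quantization: map to an int grid, then
    immediately dequantize, so downstream math sees 'quantized' values
    while the tensor stays in floating point."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale

def ste_fake_quantize(x: torch.Tensor, scale: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Straight-through estimator: forward uses the fake-quantized values,
    backward treats the rounding as identity so gradients keep updating the
    full-precision weights during QAT finetuning."""
    return x + (fake_quantize(x, scale, num_bits) - x).detach()
```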
