Allow compressed-tensors quantized model to be trained #34520

Draft
wants to merge 12 commits into base: main
Conversation

horheynm
Contributor

What does this PR do?

Using HFQuantizer, models that were quantized with compressed-tensors can already be loaded.
This PR fixes the remaining issue so that those models can also be trained (see the sketch below).
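
A rough sketch of the intended usage, assuming a hypothetical compressed-tensors checkpoint on the Hub and plain Trainer fine-tuning; the model id, dataset, and hyperparameters below are illustrative only, not part of this PR:

```python
# Rough sketch only: "org/llama-example-w8a8-compressed-tensors" is a
# hypothetical checkpoint name, and the dataset / hyperparameters are
# placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "org/llama-example-w8a8-compressed-tensors"  # hypothetical

# Loading already works: the checkpoint's quantization_config is picked up
# and routed through the compressed-tensors HFQuantizer.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# What this PR enables: passing that quantized model to Trainer and
# fine-tuning it, instead of it being treated as non-trainable.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ct-qat-out",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```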

Who can review?

@SunMarc @younesbelkada

@horheynm changed the title from "Nm train quantized models from compressed tensors" to "Allow compressed-tensors quantized model to be trained" on Oct 30, 2024
@SunMarc (Member) left a comment

Thanks for the PR! I left a few suggestions. Could you explain a bit more how you are performing training with compressed-tensors models if you are not using PEFT? Are you maybe doing QAT, or just adding custom LoRA layers yourself?

src/transformers/utils/quantization_config.py (outdated, resolved)
src/transformers/trainer.py (outdated, resolved)
@horheynm horheynm marked this pull request as ready for review November 5, 2024 16:29
@horheynm horheynm marked this pull request as draft November 5, 2024 16:31
@horheynm (Contributor, Author) commented Nov 6, 2024

@SunMarc
Yes, we quantize the model using oneshot from compressed-tensors, then load that model with AutoModelForCausalLM and HFQuantizer. Once loaded, we run training as QAT finetuning; the 'quantization' applied during training is fake quant.

We are not using LoRA adapters.
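
For readers unfamiliar with the term: fake quant here means the forward pass simulates quantized numerics (round to the integer grid, then immediately dequantize) while weights and gradients stay in floating point, which is what makes QAT finetuning possible. A generic PyTorch illustration of the idea, not the compressed-tensors implementation:

```python
import torch

def fake_quantize(x: torch.Tensor, scale: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Generic symmetric fake quantization: map to an int grid, then
    immediately dequantize, so downstream math sees 'quantized' values
    while the tensor stays in floating point."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale

def ste_fake_quantize(x: torch.Tensor, scale: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Straight-through estimator: forward uses the fake-quantized values,
    backward treats the rounding as identity so gradients keep updating the
    full-precision weights during QAT finetuning."""
    return x + (fake_quantize(x, scale, num_bits) - x).detach()
```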
