Some weights of the model checkpoint at `finetune_starcoder2/final_checkpoint` were not used when initializing Starcoder2ForCausalLM #7
Comments
I'm having the same problem. Is this the correct way to load the fine-tuned model? Is there no need to merge the LoRA adapter?
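As a minimal sketch of the loading path the thread is discussing (the identifiers `base_model_id` and `adapter_dir` are placeholders, not paths from this repository): when the checkpoint directory only contains LoRA adapter weights, the base model should be loaded first and the adapter attached on top of it, rather than passing the adapter directory straight to `from_pretrained`.

```python
def load_finetuned(base_model_id: str, adapter_dir: str):
    """Load the base model and attach a saved LoRA adapter on top of it.

    Hypothetical helper: `base_model_id` and `adapter_dir` stand in for the
    real base checkpoint and the trainer's output directory.
    """
    # Imports are local so the sketch stays import-safe when the heavy
    # libraries are not installed.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Attaching the adapter via PeftModel avoids the "some weights of the
    # model checkpoint were not used" warning that appears when an
    # adapter-only directory is treated as a full-model checkpoint.
    return PeftModel.from_pretrained(base, adapter_dir)
```

This is only one plausible reading of the warning in the issue title; if the checkpoint actually contains merged full-model weights, plain `AutoModelForCausalLM.from_pretrained` on that directory would be correct instead.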
The expected output of the fine-tuning process should be the PEFT adapter weights, not the whole model. Once training finishes, you'll see the model safetensors under `final_checkpoint`, but if you check, their size is similar to the original model's, while the adapter should be on the order of MB. I guess something is broken in one of the libraries. If you check the object type you'll notice the problem:
The output for the model type is:
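The actual type output was not preserved in this thread, but the size check described above is easy to script. A small sketch (the directory name is whatever your trainer wrote, e.g. `final_checkpoint`): a LoRA-only save should contain an adapter file of tens of MB, not multi-GB model shards.

```python
import pathlib


def checkpoint_report(checkpoint_dir: str) -> dict:
    """Map each .safetensors file in a checkpoint directory to its size in MB.

    An adapter-only save should show a single small file (order of MB);
    sizes close to the base model's suggest full weights were saved instead.
    """
    root = pathlib.Path(checkpoint_dir)
    return {p.name: p.stat().st_size / 1e6 for p in root.glob("*.safetensors")}
```

For the type check itself, `isinstance(model, peft.PeftModel)` distinguishes a PEFT wrapper from a plain `Starcoder2ForCausalLM`.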
I've temporarily fixed this issue by changing the next line (although I'm not sure if it's totally correct): Then, I can use the merge_peft_adapters.py script from StarCoder's repository and do inference.
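For readers who don't have the StarCoder repository at hand, the merge step described here boils down to roughly the following sketch (the function name and all three path parameters are placeholders; this is an approximation of what `merge_peft_adapters.py` does, not its actual code):

```python
def merge_adapter(base_model_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fold LoRA adapter weights into the base model and save a standalone
    checkpoint that can be loaded without peft at inference time."""
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_model_id)
    # merge_and_unload() applies the low-rank updates to the base weights
    # and returns a plain transformers model.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir)
```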
Hey, nice work! How's the performance? I am wondering if it's even worth doing this.
In my case it is a must, since I'm doing instruction fine-tuning, and the performance of the fine-tuned model is as good as expected.
I get the following error after finetuning this model on the R dataset following the example in the README.
Also, I don't think doing 4-bit quantization as a default for finetuning is a good idea. It should be opt-in with a flag.
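The opt-in behavior suggested above could look like the following sketch (the `--load_in_4bit` flag name and the parser are hypothetical, not taken from this repository's scripts):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical launcher flags: 4-bit loading is off unless requested."""
    parser = argparse.ArgumentParser(description="fine-tuning launcher sketch")
    parser.add_argument(
        "--load_in_4bit",
        action="store_true",
        help="opt in to 4-bit quantization (default: full-precision weights)",
    )
    return parser
```

With `action="store_true"` the flag defaults to `False`, so quantization only kicks in when the user explicitly passes `--load_in_4bit`.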
I am also wondering why we use The Stack v1 for fine-tuning and not The Stack v2?