How to correctly load and merge finetuned LLaMA models in different formats? #11

chenmiaomiao opened this issue May 12, 2023 · 0 comments


I am new to NLP and currently exploring the LLaMA model. I understand that this model exists in different formats: the original format and the Hugging Face format. I have fine-tuned LLaMA on my dataset using https://github.com/lxe/llama-peft-tuner, which is based on minimal-llama and saves checkpoints in the layout shown below:

$ ll llama-peft-tuner/models/csco-llama-7b-peft/
total 16456
drwxrwxr-x 8 lachlan lachlan     4096 May 10 10:42 ./
drwxrwxr-x 5 lachlan lachlan     4096 May 10 10:06 ../
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:21 checkpoint-1000/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:28 checkpoint-1500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:35 checkpoint-2000/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:42 checkpoint-2500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:13 checkpoint-500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:42 model-final/
-rw-rw-r-- 1 lachlan lachlan 16814911 May 10 10:42 params.p

$ ll llama-peft-tuner/models/csco-llama-7b-peft/checkpoint-2500/
total 7178936
drwxrwxr-x 2 lachlan lachlan       4096 May 10 10:42 ./
drwxrwxr-x 8 lachlan lachlan       4096 May 10 10:42 ../
-rw-rw-r-- 1 lachlan lachlan   33629893 May 10 10:42 optimizer.pt
-rw-rw-r-- 1 lachlan lachlan 7317523229 May 10 10:42 pytorch_model.bin
-rw-rw-r-- 1 lachlan lachlan      14575 May 10 10:42 rng_state.pth
-rw-rw-r-- 1 lachlan lachlan        557 May 10 10:42 scaler.pt
-rw-rw-r-- 1 lachlan lachlan        627 May 10 10:42 scheduler.pt
-rw-rw-r-- 1 lachlan lachlan      28855 May 10 10:42 trainer_state.json
-rw-rw-r-- 1 lachlan lachlan       3899 May 10 10:42 training_args.bin

I am not quite sure about the relationship between pytorch_model.bin, the original LLaMA weights, and adapter_model.bin. I assume pytorch_model.bin is in the Hugging Face format. Now I want to produce a .pth model that I can load in https://github.com/juncongmoo/pyllama/tree/main/apps/gradio.
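One way to tell these files apart (a sketch I use for illustration, not something from either repo) is to load the checkpoint on CPU, e.g. with `torch.load("pytorch_model.bin", map_location="cpu")`, and look at the parameter names: a LoRA adapter checkpoint contains keys with `lora_` in them, while a full Hugging Face LLaMA checkpoint has plain keys like `model.layers.0.self_attn.q_proj.weight`. The helper name `checkpoint_kind` below is hypothetical:

```python
def checkpoint_kind(state_dict):
    """Guess whether a state dict holds a LoRA adapter or a full model,
    based purely on its parameter names."""
    if any("lora_" in key for key in state_dict):
        return "lora_adapter"
    return "full_model"

# Dummy key sets illustrating the two cases (shapes/values omitted):
adapter_keys = {"base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight": None}
full_keys = {"model.layers.0.self_attn.q_proj.weight": None}

print(checkpoint_kind(adapter_keys))  # lora_adapter
print(checkpoint_kind(full_keys))     # full_model
```

Given the ~7 GB size of your pytorch_model.bin, it is almost certainly a full model rather than an adapter (a 7B LoRA adapter is typically tens of MB).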

I followed the manual conversion guide at https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/Manual-Conversion, which converts the original weights plus a LoRA into either the Hugging Face format (.bin) or the PyTorch format (.pth). I tried treating pytorch_model.bin as the Hugging Face model and modified the code to skip the LoRA step, but I couldn't get the desired result. The fine-tuning repository mentioned above provides a way to load the trained model by combining the original model with the learned parameters. I tried to adapt that approach to https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py and tried different combinations, but the result either does not incorporate the trained parameters or generates meaningless output.
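For context on what the merge script is doing: merging a LoRA adapter into the base weights just folds the low-rank update into each affected matrix, W_merged = W + (alpha / r) * B @ A. A tiny pure-Python illustration of that arithmetic (real merges operate on torch tensors, e.g. via peft's `merge_and_unload`; the helper names here are mine):

```python
def matmul(B, A):
    """Plain-Python matrix multiply for small lists of lists."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def merge_lora(W, A, B, alpha, r):
    """Fold the LoRA update (alpha / r) * B @ A into the base weight W."""
    delta = matmul(B, A)
    return [[W[i][j] + (alpha / r) * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
A = [[1.0, 2.0]]              # LoRA A, rank r=1, shape (1, 2)
B = [[0.5], [0.25]]           # LoRA B, shape (2, 1)
merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

If the merged model produces meaningless output, a common cause is that this update was applied to weights stored in a different layout than the adapter was trained against (Hugging Face conversion permutes the attention projection weights relative to the original Meta checkpoints), so the base format and the adapter must match.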

Can someone help me understand how to correctly load and merge these models? Any help would be greatly appreciated. Thank you.
