How to correctly load and merge finetuned LLaMA models in different formats? #11

chenmiaomiao opened this issue May 12, 2023 · 0 comments


I am new to NLP and currently exploring the LLaMA model. I understand that this model exists in different formats: the original format and the Hugging Face format. I have fine-tuned LLaMA on my dataset using https://github.com/lxe/llama-peft-tuner, which is based on minimal-llama and saves checkpoints in the layout shown below:

$ ll llama-peft-tuner/models/csco-llama-7b-peft/
total 16456
drwxrwxr-x 8 lachlan lachlan     4096 May 10 10:42 ./
drwxrwxr-x 5 lachlan lachlan     4096 May 10 10:06 ../
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:21 checkpoint-1000/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:28 checkpoint-1500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:35 checkpoint-2000/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:42 checkpoint-2500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:13 checkpoint-500/
drwxrwxr-x 2 lachlan lachlan     4096 May 10 10:42 model-final/
-rw-rw-r-- 1 lachlan lachlan 16814911 May 10 10:42 params.p

$ ll llama-peft-tuner/models/csco-llama-7b-peft/checkpoint-2500/
total 7178936
drwxrwxr-x 2 lachlan lachlan       4096 May 10 10:42 ./
drwxrwxr-x 8 lachlan lachlan       4096 May 10 10:42 ../
-rw-rw-r-- 1 lachlan lachlan   33629893 May 10 10:42 optimizer.pt
-rw-rw-r-- 1 lachlan lachlan 7317523229 May 10 10:42 pytorch_model.bin
-rw-rw-r-- 1 lachlan lachlan      14575 May 10 10:42 rng_state.pth
-rw-rw-r-- 1 lachlan lachlan        557 May 10 10:42 scaler.pt
-rw-rw-r-- 1 lachlan lachlan        627 May 10 10:42 scheduler.pt
-rw-rw-r-- 1 lachlan lachlan      28855 May 10 10:42 trainer_state.json
-rw-rw-r-- 1 lachlan lachlan       3899 May 10 10:42 training_args.bin

I am not quite sure about the relationship between pytorch_model.bin, the original LLaMA weights, and adapter_model.bin. I assume pytorch_model.bin is in the Hugging Face format. Now I want to produce a .pth model that I can load in https://github.com/juncongmoo/pyllama/tree/main/apps/gradio.
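One way to tell these files apart (a sketch I use for illustration, not something from either repo) is to load the checkpoint on CPU, e.g. with `torch.load("pytorch_model.bin", map_location="cpu")`, and look at the parameter names: a LoRA adapter checkpoint contains keys with `lora_` in them, while a full Hugging Face LLaMA checkpoint has plain keys like `model.layers.0.self_attn.q_proj.weight`. The helper name `checkpoint_kind` below is hypothetical:

```python
def checkpoint_kind(state_dict):
    """Guess whether a state dict holds a LoRA adapter or a full model,
    based purely on its parameter names."""
    if any("lora_" in key for key in state_dict):
        return "lora_adapter"
    return "full_model"

# Dummy key sets illustrating the two cases (shapes/values omitted):
adapter_keys = {"base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight": None}
full_keys = {"model.layers.0.self_attn.q_proj.weight": None}

print(checkpoint_kind(adapter_keys))  # lora_adapter
print(checkpoint_kind(full_keys))     # full_model
```

Given the ~7 GB size of your pytorch_model.bin, it is almost certainly a full model rather than an adapter (a 7B LoRA adapter is typically tens of MB).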

I followed the manual conversion guide at https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/Manual-Conversion, which converts the original weights plus a LoRA into either the Hugging Face format (.bin) or the PyTorch format (.pth). I tried treating pytorch_model.bin as the Hugging Face model and modified the code to skip the LoRA step, but I couldn't get the desired result. The fine-tuning repository mentioned above provides a way to load the trained model by combining the original model with the learned parameters. I tried to adapt that approach to https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py and tried different combinations, but the result either does not incorporate the trained parameters or generates meaningless output.
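For context on what the merge script is doing: merging a LoRA adapter into the base weights just folds the low-rank update into each affected matrix, W_merged = W + (alpha / r) * B @ A. A tiny pure-Python illustration of that arithmetic (real merges operate on torch tensors, e.g. via peft's `merge_and_unload`; the helper names here are mine):

```python
def matmul(B, A):
    """Plain-Python matrix multiply for small lists of lists."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def merge_lora(W, A, B, alpha, r):
    """Fold the LoRA update (alpha / r) * B @ A into the base weight W."""
    delta = matmul(B, A)
    return [[W[i][j] + (alpha / r) * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
A = [[1.0, 2.0]]              # LoRA A, rank r=1, shape (1, 2)
B = [[0.5], [0.25]]           # LoRA B, shape (2, 1)
merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

If the merged model produces meaningless output, a common cause is that this update was applied to weights stored in a different layout than the adapter was trained against (Hugging Face conversion permutes the attention projection weights relative to the original Meta checkpoints), so the base format and the adapter must match.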

Can someone help me understand how to correctly load and merge these models? Any help would be greatly appreciated. Thank you.
