Minimal changes compared with llava1.5 #2
Thanks a lot for your wonderful work. I wonder if you could provide the required minimal changes based on the official LLaVA 1.5 code. I would appreciate it a lot if you could help me. Thanks.

Comments
For the pretraining stage, the training process is successful with normal loss values.

I tried modifying the variable "--version" from "llama_3_1" to "llama_3". With that change the problem is solved: the warning disappears and the training loss is normal. However, I would like to train llama_3_1.
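For context, here is a minimal sketch of what a `--version` flag typically controls in LLaVA-style training scripts: it selects which preprocess/conversation template is applied to the data. All names below are illustrative assumptions, not LLaVA-MORE's actual identifiers.

```python
# Illustrative sketch only (assumed names, not LLaVA-MORE's actual code):
# in LLaVA-style trainers, --version picks the preprocess function used
# to format the dialogue data before tokenization.
import argparse

def preprocess_llama_3(sample: str) -> str:
    # placeholder for the LLaMA 3 chat-template preprocessing
    return f"<llama_3 formatted> {sample}"

def preprocess_llama_3_1(sample: str) -> str:
    # placeholder for the LLaMA 3.1 chat-template preprocessing
    return f"<llama_3_1 formatted> {sample}"

PREPROCESS = {"llama_3": preprocess_llama_3, "llama_3_1": preprocess_llama_3_1}

parser = argparse.ArgumentParser()
parser.add_argument("--version", default="llama_3_1", choices=PREPROCESS)
args = parser.parse_args()
print(PREPROCESS[args.version]("Hello"))
```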
Hi @YuchenLiu98, thank you for your interest in our LLaVA-MORE project! My suggestion for identifying the differences between the two codebases is to use the […]. However, to make it easier for you to address your issue, the most significant changes are located in three different parts: […]

Please check the values of […].

Federico
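As a concrete way to spot such differences, a minimal sketch using Python's standard difflib follows; the local checkout paths are assumptions and should be adjusted to where the two repos live on your machine.

```python
# A minimal sketch for diffing the same file across the two codebases.
# The local paths below are assumptions: point them at your checkouts
# of the official LLaVA repo and of LLaVA-MORE.
import difflib
from pathlib import Path

llava_file = Path("LLaVA/llava/train/train.py")      # official LLaVA 1.5
more_file = Path("LLaVA-MORE/llava/train/train.py")  # LLaVA-MORE

diff = difflib.unified_diff(
    llava_file.read_text().splitlines(keepends=True),
    more_file.read_text().splitlines(keepends=True),
    fromfile=str(llava_file),
    tofile=str(more_file),
)
print("".join(diff))
```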
Unfortunately, I failed to solve the problem. A very curious thing is that when I modify "version" to "llama3", the finetuning process continues successfully with normal loss values and without the size-mismatch warning. Do you have any idea about this problem? Thanks a lot for your help.
The behavior you're observing is due to the fact that the preprocess functions for LLaMA 3 and LLaMA 3.1 are very similar, as is the structure of their respective tokenizers (dimension and special tokens). Based on the logs you sent in your previous message, you can try to comment out this line: `llava/train/train.py`, line 592 (at commit afd373a) in LLaVA-MORE.
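To see the similarity Federico describes, a quick check like the following can compare the two tokenizers. The Hugging Face model IDs are assumptions, and both checkpoints are gated, so access must be granted on the Hub first.

```python
# A hedged sanity check (not code from the repo): compare the LLaMA 3 and
# LLaMA 3.1 tokenizers, whose size and special tokens should match.
# Model IDs are assumptions; both repos are gated on the Hugging Face Hub.
from transformers import AutoTokenizer

tok_3 = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
tok_31 = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

print(len(tok_3), len(tok_31))      # tokenizer sizes (vocab + added tokens)
print(tok_3.special_tokens_map)     # special tokens of LLaMA 3
print(tok_31.special_tokens_map)    # special tokens of LLaMA 3.1
```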
If you notice any differences, could you send us the versions of the libraries you're using, specifically […]?

Federico
Many thanks for your help. With the modification you mentioned (deleting line 592 in train.py), the warning disappears and the finetuning process seems to be going well. Specifically, I use tokenizers==0.19.1, transformers==4.43.1, torch==2.1.2, and torchvision==0.16.2. I wonder whether this modification will influence the final performance; I will test the result once the training finishes. Thanks a lot for your help.
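For anyone reproducing this, a quick way to confirm the same environment (versions taken from the comment above) is:

```python
# Print installed versions to check against the ones reported above:
# tokenizers==0.19.1, transformers==4.43.1, torch==2.1.2, torchvision==0.16.2.
import tokenizers
import transformers
import torch
import torchvision

print("tokenizers  :", tokenizers.__version__)
print("transformers:", transformers.__version__)
print("torch       :", torch.__version__)
print("torchvision :", torchvision.__version__)
```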
I had the same problem; when I commented out the line, the warning disappeared.
Hi, as mentioned in issue #7, I recommend referring to that issue to fix the problem.