Minimal changes compared with llava1.5 #2

Closed
YuchenLiu98 opened this issue Aug 17, 2024 · 8 comments

Comments

@YuchenLiu98

Thanks a lot for your wonderful work. Could you list the minimal changes required on top of the official LLaVA (1.5) code? Any help would be much appreciated. Thanks.

@YuchenLiu98
Author

For the pretraining stage, training runs successfully with normal loss values.
For the finetuning stage, however, the training loss is 0 from the very beginning, and I also see tokenization mismatch warnings:
"WARNING: tokenization mismatch: 119 vs. 115. (ignored)
WARNING: tokenization mismatch: 115 vs. 111. (ignored)
WARNING: tokenization mismatch: 117 vs. 113. (ignored)
WARNING: tokenization mismatch: 131 vs. 127. (ignored)
WARNING: tokenization mismatch: 129 vs. 125. (ignored)"
Do you have any idea how to solve this problem? Thanks a lot for your help.

@YuchenLiu98
Author

I tried changing the "--version" argument from "llama_3_1" to "llama_3". With that change the warnings disappear and the training loss is normal. However, I would like to train with llama_3_1.

@federico1-creator
Collaborator

Hi @YuchenLiu98, thank you for your interest in our LLaVA-MORE project!

My suggestion for identifying the differences between the two codebases is to use the diff -rq command between the two repositories. This will help you see which files have been changed.
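For example, assuming the two repositories are cloned side by side (the directory names below are placeholders, not the exact paths):

diff -rq LLaVA LLaVA-MORE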

However, to make it easier for you to address your issue, the most significant changes are concentrated in three different parts of the codebase.

Please check the values of cur_len and total_len, and make sure the correct tokenizer is instantiated.

Federico

@YuchenLiu98
Author

Unfortunately, I have not been able to solve the problem. A very curious thing is that when I change "version" to "llama3", finetuning runs successfully with normal loss values and without the mismatch warnings. Do you have any idea about this problem? Thanks a lot for your help.

@federico1-creator
Collaborator

federico1-creator commented Aug 19, 2024

The behavior you're observing is due to the fact that the preprocess functions for LLaMA 3 and LLaMA 3.1 are very similar, as is the structure of their respective tokenizers (dimension and special tokens).

Based on the logs you sent in your previous message, you can try to comment out this line.

cur_len = cur_len + len(tokenizer(sep, add_special_tokens=False).input_ids)
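
For context, here is a minimal sketch of why that line matters (the model ids and separator string below are assumptions for illustration, not the exact values used by LLaVA-MORE): if the tokenizer splits the separator into a different number of tokens than the preprocess code expects, cur_len drifts away from total_len and the mismatch warning above is triggered.

from transformers import AutoTokenizer

# Compare how many tokens the conversation separator produces under the two
# tokenizers. If this count differs from what the preprocess function assumes,
# cur_len no longer matches total_len at the end of the sample.
sep = "<|eot_id|>"  # assumed separator; the real one depends on the conversation template
for name in ["meta-llama/Meta-Llama-3-8B-Instruct",
             "meta-llama/Meta-Llama-3.1-8B-Instruct"]:
    tok = AutoTokenizer.from_pretrained(name)
    n_sep = len(tok(sep, add_special_tokens=False).input_ids)
    print(f"{name}: separator -> {n_sep} token(s)")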

If you notice any differences, could you send us the versions of the libraries you're using, specifically tokenizers, transformers, torch, and cuda?

Federico

@YuchenLiu98
Copy link
Author

Many thanks for your help. With the modification you mentioned (deleting Line 592 in train.py), the warning disappears and the finetuning process seems to be going well. Specifically, I use tokenizers==0.19.1, transformers==4.43.1, torch==2.1.2, torchvision==0.16.2. I wonder whether this modification will influence the final performance; I will test the result once the training finishes. Thanks a lot for your help.
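
For anyone reproducing this, a quick, generic way to collect the library and CUDA versions that were asked about (plain Python, nothing project-specific):

import torch
import transformers
import tokenizers

# Print the versions relevant to the tokenization behaviour discussed above.
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)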

@aoji0606

I got the same problem; when I commented out the line, the warning disappeared.
What is the purpose of this line?
cur_len = cur_len + len(tokenizer(sep, add_special_tokens=False).input_ids)
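
For reference, a simplified sketch of the LLaVA-style check that emits the warning (an illustration of the general pattern in train.py, not the exact LLaVA-MORE code): the line advances cur_len past the separator tokens so that the masking of instruction tokens stays aligned with the tokenized conversation; if cur_len ends up different from total_len, the whole target is masked and the sample contributes a loss of 0.

IGNORE_INDEX = -100  # labels with this value are skipped by the loss

# Simplified illustration: cur_len is advanced turn by turn (including the
# separator tokens added by the line above), while total_len is the true
# length of the tokenized conversation.
def check_alignment(target, cur_len, total_len, tokenizer):
    if cur_len < tokenizer.model_max_length and cur_len != total_len:
        # Misalignment: ignore the whole sample so it does not corrupt training.
        target[:] = IGNORE_INDEX
        print(f"WARNING: tokenization mismatch: {cur_len} vs. {total_len}. (ignored)")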

@federico1-creator
Collaborator

Hi, as mentioned in issue #7, the tokenization mismatch might be caused by using a different version of the LLaMA 3.1 tokenizer.

I recommend referring to that issue to fix the problem.
@aoji0606 @YuchenLiu98
