Why is the inference BLEU low in translation task (en-de)? #5

Open
ChenYang1024 opened this issue Jun 28, 2023 · 6 comments

@ChenYang1024

Hello, I have a problem with ParroT inference.
On the WMT22 test set (en-de), I loaded the LLaMA-7b parameters for inference and the BLEU was only 6.9808. After loading the fine-tuned ParroT-Hint-7b-lora parameters you provided, without adding a Hint, BLEU did not improve. How can I improve inference performance? Thank you!

@wxjiao
Owner

wxjiao commented Jun 28, 2023

Can you provide your running commands and log? Without these, I cannot locate the bugs. Thx.

@ChenYang1024
Author

Hello, thanks for your reply! I first ran inference.sh with the following parameters to generate translation results:
python3 train/inference_lora.py --model-name-or-path checkpoints/llama-7b \
    --lora-weights checkpoints/Parrot-Hint-7b-lora/adapter_model \
    -lp 'en-de' \
    -t 0.1 \
    -sa 'beam' \
    -ins test/instruct_inf.txt \
    -i test/WMT/newstest22.en-de.en \
    -o test/demo/translation_parro_demo.txt
The llama-7b weights above are in HF format, and the adapter_model weights are from https://huggingface.co/wxjiao/ParroT-Hint-7b-lora.
Then I extracted the translation results that precede "### Instruction" from translation_parrot.txt.hyp into translation_output.txt.
Next, I ran score.sh with the following parameters to evaluate those translation results:
git clone https://github.com/huggingface/transformers
cd transformers
export PAIR=en-de
export DATA_DIR=./test/WMT
export SAVE_DIR=./test/WMT
export BS=8
export NUM_BEAMS=15
mkdir -p $DATA_DIR
echo $PAIR
PYTHONPATH="src:examples/seq2seq" python examples/legacy/seq2seq/run_eval_demo.py facebook/wmt19-$PAIR $DATA_DIR/newstest22.en-de.en $SAVE_DIR/translation_output.txt --reference_path $DATA_DIR/newstest22.en-de.de --score_path $SAVE_DIR/test_bleu.json --bs $BS --task translation --num_beams $NUM_BEAMS
The run_eval_demo.py above is a modified copy of the default run_eval.py from transformers; the only change is that the translation generation function is removed. The resulting BLEU is 6.9808.
How can I improve inference performance? Thank you again for your help!

@wxjiao
Owner

wxjiao commented Jul 1, 2023

  • Generally, translation_parrot.txt.hyp already contains the final translation results with the instruction format removed (see the function post_process()), so there is no need for an additional extraction. Did you take a look at the translation results? You can check the format even if you do not read German, or use Google Translate to get a rough understanding of the German text.
  • Besides, just use SacreBLEU as follows:
cat translation_parrot.txt.hyp | sacrebleu -w 4 newstest22.en-de.de

Thanks.
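
The format check suggested above can be sketched in Python. This is only an illustrative helper, not part of ParroT: the function names `check_hypotheses` and `read_lines` are hypothetical, and the only detail taken from the thread is the "### Instruction" marker. It flags a line-count mismatch against the reference, empty hypotheses, and lines that still contain prompt text.

```python
import sys
from pathlib import Path

# Prompt marker mentioned in the thread; everything else here is an assumption.
MARKER = "### Instruction"


def check_hypotheses(hyp_lines, ref_lines, marker=MARKER):
    """Return a list of human-readable problems found in the hypothesis lines."""
    problems = []
    # Corpus-level BLEU pairs hypothesis i with reference i, so counts must match.
    if len(hyp_lines) != len(ref_lines):
        problems.append(
            f"line count mismatch: {len(hyp_lines)} hypotheses "
            f"vs {len(ref_lines)} references"
        )
    for i, line in enumerate(hyp_lines, start=1):
        if not line.strip():
            problems.append(f"line {i}: empty hypothesis")
        elif marker in line:
            problems.append(f"line {i}: leftover prompt text ({marker!r})")
    return problems


def read_lines(path):
    """Read a UTF-8 text file into a list of lines without line endings."""
    return Path(path).read_text(encoding="utf-8").splitlines()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_hyp.py <hypothesis file> <reference file>
    for problem in check_hypotheses(read_lines(sys.argv[1]), read_lines(sys.argv[2])):
        print(problem)
```

If this reports nothing, the file is at least structurally fit for scoring; it says nothing about translation quality itself.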

@ChenYang1024
Author

Thanks for your response, I will try it again. Your efforts in addressing my questions have truly helped me gain a deeper understanding of the workflow. I sincerely hope you will produce more solid work!

@ChenYang1024
Author

Hello, I'm sorry to bother you again.
When I load the llama-7b parameters from https://huggingface.co/wxjiao/llama-7b, I get a new error:
[2023-07-02 12:15:18,963] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.

Loading checkpoint shards: 0%| | 0/33 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/33 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 415, in load_state_dict
return torch.load(checkpoint_file, map_location="cpu")
File "/home/bingxing2/gpuuser590/.conda/envs/ParroT/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/bingxing2/gpuuser590/.conda/envs/ParroT/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train/inference_lora.py", line 116, in <module>
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.float16, device_map="auto")
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
return model_class.from_pretrained(
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 2643, in from_pretrained
) = cls._load_pretrained_model(
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 2952, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 420, in load_state_dict
raise OSError(
OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned.
/home/bingxing2/gpuuser590/yc/ParroT/inference_LLMlora.sh: line 87: tion_parrot.txt: command not found
[2023-07-02 12:15:54,859] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.

Loading checkpoint shards: 0%| | 0/33 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/33 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 415, in load_state_dict
return torch.load(checkpoint_file, map_location="cpu")
File "/home/bingxing2/gpuuser590/.conda/envs/ParroT/lib/python3.8/site-packages/torch/serialization.py", line 795, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/bingxing2/gpuuser590/.conda/envs/ParroT/lib/python3.8/site-packages/torch/serialization.py", line 1002, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train/inference_lora.py", line 116, in <module>
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.float16, device_map="auto")
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
return model_class.from_pretrained(
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 2643, in from_pretrained
) = cls._load_pretrained_model(
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 2952, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/home/bingxing2/gpuuser590/yc/ParroT/transformers/src/transformers/modeling_utils.py", line 420, in load_state_dict
raise OSError(
OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pull in the folder you cloned.

The llama-7b weights above seem to fail to load.

@wxjiao
Owner

wxjiao commented Jul 12, 2023

It seems that the llama model weights were not downloaded correctly. Can you compare the weights you downloaded with those at https://huggingface.co/wxjiao/llama-7b? I'd recommend downloading the weights directly using wget with the URLs of the checkpoints. Thanks.
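
One way to confirm this before re-downloading is to check whether the shard files are Git LFS pointer stubs rather than real weights. A pointer file is a small text file whose first line is "version https://git-lfs.github.com/spec/v1", which would be consistent with the "invalid load key, 'v'" error in the log. The sketch below is a hedged illustration: the helper names and the "*.bin" file pattern are assumptions, not something stated in the thread.

```python
import sys
from pathlib import Path

# First bytes of every Git LFS pointer file, per the Git LFS pointer format.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header instead of binary data."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER


def find_lfs_pointers(checkpoint_dir, pattern="*.bin"):
    """Return sorted names of files in checkpoint_dir that are only LFS pointers.

    The "*.bin" pattern is an assumption about how the shards are named.
    """
    return sorted(
        p.name for p in Path(checkpoint_dir).glob(pattern) if is_lfs_pointer(p)
    )


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python check_lfs.py checkpoints/llama-7b
    for name in find_lfs_pointers(sys.argv[1]):
        print(f"{name}: Git LFS pointer, real weights not downloaded")
```

Any file reported here is only a pointer stub of a few hundred bytes, so it needs to be fetched again, for example with git lfs pull or a direct download as recommended above.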
