Perplexity is off for Llama 2-7b #47

Open
taratt opened this issue May 6, 2024 · 4 comments

@taratt

taratt commented May 6, 2024

Hello,
I hope this finds you well.

I was trying to prune Llama 2-7b with wanda (cloned directly from your codebase), so I ran the following command:
python main.py --model meta-llama/Llama-2-7b-hf --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama2_7b/unstructured/wanda/

but I get a perplexity of 10.27, which is way higher than what you are reporting. The model is pruned with C4 calibration data and evaluated on WikiText-2 (I changed nothing in the codebase). Do you maybe have a guess as to what I might be doing wrong?

TIA

@Eric-mingjie
Collaborator

Hi, could you check whether the perplexity of the dense LLaMA2-7b matches our number in the paper as well?

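In case it helps, here is a minimal sketch for sanity-checking the dense WikiText-2 perplexity outside this repo's own eval code; it is a plain transformers/datasets loop, and the seqlen=4096 setting and chunking scheme are assumptions meant to roughly mirror the paper's setup, not the repo's exact implementation:

```python
# Minimal dense-perplexity sanity check (a sketch, not this repo's eval code).
# Assumes torch, transformers, datasets, and accelerate are installed and that
# the fp16 model fits on the available GPU(s); seqlen=4096 mirrors the paper.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
seqlen = 4096

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Tokenize the whole test split as one long stream, then score it in seqlen-sized chunks.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids
nsamples = ids.numel() // seqlen

nlls = []
with torch.no_grad():
    for i in range(nsamples):
        batch = ids[:, i * seqlen : (i + 1) * seqlen].to(model.device)
        # With labels=batch, HF shifts the targets internally and returns mean cross-entropy.
        loss = model(batch, labels=batch).loss
        nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))
print(f"dense wikitext2 ppl: {ppl.item():.2f}")
```

If a standalone check like this also gives ~7.7 rather than ~5.1, the problem is more likely in the environment (tokenizer, library versions, or downloaded weights) than in the pruning code.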
@taratt
Author

taratt commented May 7, 2024

Hi,
Thanks for your prompt response.
Yes, the dense model is off too. I'm getting 7.72 for LLaMA2-7b, while you report 5.12.
Could you maybe clone your repository again yourself and check whether you can reproduce the results?

@Eric-mingjie
Collaborator

Hmm, the number I get from rerunning is still 5.12 for dense LLaMA2 (context size 4096). Even with context size 2048 the number would be around 5.5, as verified by other works (e.g., Table 4 in https://arxiv.org/abs/2306.00978).
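(For context on why the window size matters: perplexity is the exponentiated average token-level negative log-likelihood, so with a longer evaluation window each token is conditioned on more preceding text, the average loss drops, and the 4096-context number comes out lower than the 2048-context one:

$$\mathrm{PPL} = \exp\Big(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\Big)$$

)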

@taratt
Author

taratt commented May 7, 2024

I'm running with context size 4096 as well (nsamples = 333). This is so weird. Which versions of datasets and transformers are you using?
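A quick way for both sides to compare environments (just a convenience snippet, not part of the repo):

```python
# Print the library versions relevant to tokenization/eval (assumes they are installed).
import torch, transformers, datasets
print("torch       ", torch.__version__)
print("transformers", transformers.__version__)
print("datasets    ", datasets.__version__)
```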
