Reproduction issue of task GSM8K with Llama3.2-1B-Instruct #810
Comments
@VoiceBeer Can you show me your …
I noticed that this …
@wukaixingxp Hi, sorry for the delay! The URL is https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/gsm8k/gsm8k-cot-llama.yaml, and it does come from the lm-eval repo. I've noticed that the llama-recipes repo does not currently contain a gsm8k config, so is there a way I can reproduce the reported gsm8k results?
Hi, any further info? :> I'm still stuck on how to evaluate models on gsm8k; the .yaml file is here:
Hi, thanks for the work and the new Llama 3.2 reproduction update, but I'm running into an issue reproducing gsm8k.
I manually added a gsm8k directory containing the .yaml file here to my work_dir,
and the command was:
CUDA_VISIBLE_DEVICES=3 lm_eval --model hf --model_args pretrained=/data/models/meta-llama/Llama-3.2-1B-Instruct,dtype=auto,parallelize=False,add_bos_token=True --tasks meta_gsm8k --batch_size 4 --output_path eval_results_general --include_path llama32_1B_workdir --seed 42 --log_samples --fewshot_as_multiturn --apply_chat_template
The result I got is 0.4003, which differs from the officially reported 44.4.
Is there anything I missed? Thanks