
Using the humanevalpack to test the ChatGLM3 model results in an abnormal score. #251

Open
burger-pb opened this issue Jul 5, 2024 · 0 comments


burger-pb commented Jul 5, 2024

Hi,
When I tried to evaluate the ChatGLM3 model on the humanevalfixdocs-python task from humanevalpack, I got an abnormal score of 0. The command I used is as follows.

accelerate launch main.py \
  --model THUDM/chatglm3-6b \
  --left_padding \
  --tasks humanevalfixdocs-python \
  --max_length_generation 2048 \
  --prompt chatglm3 \
  --trust_remote_code \
  --temperature 0.7 \
  --do_sample True \
  --n_samples 1 \
  --batch_size 64 \
  --precision bf16 \
  --allow_code_execution \
  --save_generations

The prompt I used is as follows.
elif self.prompt == "chatglm3":
    prompt = f"<|user|>{inp}<|assistant|>{prompt_base}"
else:
    raise ValueError(f"The --prompt argument {self.prompt} wasn't provided or isn't supported")
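
For reference, here is a minimal sketch to compare the hand-built string against whatever chat formatting the model itself ships. It assumes a recent transformers version and that the THUDM/chatglm3-6b tokenizer exposes a chat template (both assumptions); the official ChatGLM3 format is generally reported to put a newline after the role tags, so a missing newline in the hand-built prompt is one thing worth ruling out.

from transformers import AutoTokenizer

# Assumption: the chatglm3-6b tokenizer can be used with apply_chat_template.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

inp = "Fix the bug in the following function ..."  # placeholder instruction

# What the model's own chat formatting produces for a single user turn.
reference = tokenizer.apply_chat_template(
    [{"role": "user", "content": inp}],
    tokenize=False,
    add_generation_prompt=True,
)

# The hand-built prompt from the snippet above (without prompt_base).
manual = f"<|user|>{inp}<|assistant|>"

# If the two strings differ (e.g. missing newlines after <|user|> / <|assistant|>),
# the model may never open a proper assistant turn and can echo or stay silent.
print(repr(reference))
print(repr(manual))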

The issue is that the model did not produce any answers in the saved generations.
The generation for the first problem is shown below; as you can see, it contains only the prompt and no response from the model.

from typing import List\n\n\ndef has_close_elements(numbers: List[float], threshold: float) -> bool:\n """ Check if in given list of numbers, are any two numbers closer to each other than\n given threshold.\n >>> has_close_elements([1.0, 2.0, 3.0], 0.5)\n False\n >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)\n True\n """
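
As a quick sanity check on the saved outputs, a short script along these lines can count how many problems got no continuation beyond the prompt. The generations.json file name and the list-of-lists layout are assumptions based on the harness's --save_generations default.

import json

# Assumption: --save_generations wrote a JSON list of lists of strings
# (one inner list of n_samples generations per problem) to generations.json.
with open("generations.json") as f:
    generations = json.load(f)

empty = 0
for i, samples in enumerate(generations):
    for gen in samples:
        # Heuristic: for humanevalfixdocs-python the prompt ends with the closing
        # docstring quotes, so a generation that still ends there (or is blank)
        # contains no model continuation after the prompt was stripped.
        if not gen.strip() or gen.strip().endswith('"""'):
            empty += 1
            print(f"problem {i}: generation appears to contain only the prompt")
            break

print(f"{empty}/{len(generations)} problems had no model continuation")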
