Evaluate on long context (32k, 64k, etc.) on 30B/70B large models #48
You are right. We will update soon to also support 30B/70B models with accelerate/deepspeed.
Waiting.
How do you do it? Can you share your code?
You may need to make some modifications to the code; I removed model2path and some other things. world_size and args.s are unused in this version, so you can remove them.
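For context, a minimal sketch of what such a stripped-down launcher might look like. This is an assumption based on the description above, not the actual modified script; the remaining argument names are illustrative:

```python
import argparse

from transformers import AutoModelForCausalLM, AutoTokenizer


def parse_args():
    # The original script's model2path lookup, world_size, and args.s are
    # dropped; the checkpoint path is passed in directly instead.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, required=True,
                        help="local path or hub id of the model to evaluate")
    parser.add_argument("--max_length", type=int, default=32768,
                        help="maximum context length used during evaluation")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    tokenizer = AutoTokenizer.from_pretrained(args.model, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(args.model, trust_remote_code=True)
```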
Hi @CaesarWWK, thanks for your reply! @lvjianxin An easy way (without modifying the current codebase much) might be to add …
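The comment above is cut off in the thread, so the exact suggestion is missing. One common approach that fits the description (an assumption, not necessarily what was proposed) is to pass device_map="auto" to from_pretrained, which lets Accelerate shard the weights across all visible GPUs:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-70b-hf"  # illustrative checkpoint, not from the thread

tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" (requires the accelerate package) splits the layers
# across every visible GPU, so a 70B model no longer has to fit on one device.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()
```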
Hi,
I found that the original script cannot handle large models on long context effectively, since it uses multiprocessing to load an entire model onto a single GPU.
I also tried different methods to add support for 30B/70B models, such as DeepSpeed-Inference, Accelerate, and vLLM. In the end, vLLM was able to run the benchmark on large models with long context (a 34B model with a 32k context on an 8×A800 node in my case), and it required minimal changes to the original code.
I hope this information helps others who also want to evaluate large models.
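For reference, a minimal sketch of the vLLM route described above, using vLLM's offline LLM API and assuming an 8-GPU node; the model path, context length, and sampling settings are placeholders:

```python
from vllm import LLM, SamplingParams

# tensor_parallel_size=8 shards the model across the node's 8 GPUs, and
# max_model_len caps the context window at 32k tokens.
llm = LLM(
    model="/path/to/34b-long-context-model",  # placeholder checkpoint path
    tensor_parallel_size=8,
    max_model_len=32768,
    dtype="float16",
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=128)

# Each prompt here would be one long-context example from the benchmark.
outputs = llm.generate(["<long-context prompt>"], sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```

Because vLLM batches and schedules requests itself, the original per-GPU multiprocessing can be replaced by a single process that feeds it the whole dataset.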