[Bug] Deepseek-v2-lite AMD MI300 run failed #2384
Comments
@BruceXcluding Thanks for taking a look!
Same error when running /deepseek-coder-v2-instruct-awq on A40.
@BruceXcluding @cxmt-ai-tc Could you try changing
It works after the change.
How do I make the change? Which file, and how do I compile/install it?
Docker image: henryx/haisgl:sgl0.3.2_vllm0.6.0_torch2.5_rocm6.2_triton3.0.0
I changed BLOCK=128 to BLOCK=64 and got this error:
root@s0pgpuap12:/workspace# CUDA_VISIBLE_DEVICES=2,3,6,7 python3 -m sglang.launch_server --model-path /nas_data/userdata/tc/models/deepseek/deepseek-coder-v2-instruct-awq/ --port 50800 --host 0.0.0.0 --tp 4 --trust-remote-code
[2024-12-09 03:41:08 TP1] Load weight end. type=DeepseekV2ForCausalLM, dtype=torch.float16, avail mem=13.00 GB
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Possible solutions:
[2024-12-09 03:41:12 TP0] Scheduler hit an exception: Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Possible solutions:
[2024-12-09 03:41:12 TP2] Scheduler hit an exception: Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Possible solutions:
[2024-12-09 03:41:12 TP3] Scheduler hit an exception: Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Possible solutions:
Killed
@cxmt-ai-tc Can you try the instructions in #2601?
Checklist
Describe the bug
Running Deepseek-v2 in a ROCm environment fails with a Triton compiler error.
Bug report:
Reproduction
Environment
docker image
henryx/haisgl:sgl0.3.2_vllm0.6.0_torch2.5_rocm6.2_triton3.0.0