RuntimeError: CUDA error: invalid device ordinal #27

Closed
uestc-huangyw opened this issue Jul 4, 2024 · 3 comments

Comments

@uestc-huangyw

When training on a single machine with a single GPU, I get the following error:

[rank5]: RuntimeError: CUDA error: invalid device ordinal
[rank5]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank5]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[rank5]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
@NLPJCL
Owner

NLPJCL commented Jul 4, 2024

Please share the contents of your default_fsdp.yaml config file, along with the arguments you ran the command with.
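
The config matters here because the log comes from rank5 while the run is described as single-GPU. The thread does not record what the actual fix was, so the following is only a sketch of one common cause of "invalid device ordinal": more processes launched than GPUs visible. It assumes the config exposes a process count (the name `num_processes` is an assumption, as is the helper `invalid_ranks`, which is not part of this project).

```python
def invalid_ranks(num_processes, device_count):
    """Return the local ranks that have no matching CUDA device.

    Valid device ordinals are 0 .. device_count - 1, so any local rank
    at or above device_count would request a device that does not exist.
    """
    return [rank for rank in range(num_processes) if rank >= device_count]


# Hypothetical values: 8 processes configured, 1 GPU visible.
# Substitute the process count from your own config and your real GPU count.
bad = invalid_ranks(num_processes=8, device_count=1)
if bad:
    print(f"Local ranks without a GPU: {bad}")
else:
    print("Every local rank maps to a visible GPU")
```

With PyTorch installed, the visible GPU count can be read from `torch.cuda.device_count()`. If the check reports ranks without a device, lowering the process count to match the GPU count is the usual remedy.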

@uestc-huangyw
Author

Thanks for your reply. It runs without problems now. Thank you as well for providing this unified fine-tuning approach.

@NLPJCL
Owner

NLPJCL commented Jul 4, 2024

Glad I could help~

@NLPJCL NLPJCL closed this as completed Jul 23, 2024