AttributeError: module 'torch.distributed' has no attribute '_all_gather_base' #1532
Comments
That PyTorch version is a bit too old for the current master branch of this repo.
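A quick way to confirm this diagnosis is to check whether the installed PyTorch exposes the attribute named in the error. The following is only a minimal sketch: `module_has_attr` is a hypothetical helper written for illustration, not part of apex or PyTorch, and it returns False when the module cannot be imported at all.

```python
import importlib


def module_has_attr(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and exposes `attr`.

    Hypothetical helper for illustration; not part of apex or PyTorch.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is not installed (or failed to import) in this environment.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # The AttributeError in the issue title names this attribute.
    available = module_has_attr("torch.distributed", "_all_gather_base")
    print("torch.distributed._all_gather_base available:", available)
```

If this prints False, the installed PyTorch is probably older than what apex master expects, and an older apex branch may be the easier fix.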
Oh, thank you.
Hi, I have the same problem, and my CUDA version is 10.2.
I used one of the older branches, e.g. 22.04-dev, which works in my environment. It looks OK, but I haven't tested any other code. You can give it a try.
Yep, I got it.
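Hello, may I ask which PyTorch version I should switch to so that this error no longer appears? Thank you.
Hello, my PyTorch and CUDA versions are the same as yours. May I ask whether you have solved this problem yet?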
Select 22.04-dev as the branch to clone.
apex now defaults to the master branch, so please switch the branch to 22.04-dev and then git clone...
Okay, I got it.
I tried some older versions, but I got another error: "Expected object of scalar type Long but got scalar type Int for argument #2 'target' in call to _thnn_nll_loss_forward". Has anyone seen this error? Thank you. Python 3.8.10
Does using 22.04-dev resolve your error?
MILVLG/bottom-up-attention.pytorch#98 (comment) It works for me. :)
The latest version of apex currently does not install, as mentioned in facebookresearch#52. This apex issue has also been reported in NVIDIA/apex#1679. huggingface/transformers#24351 suggests pinning apex to a specific commit, `cd apex && git checkout 82ee367f3da74b4cd62a1fb47aa9806f0f47b58b`, after which apex installs successfully. However, that version of apex is incompatible with the version of torch used here, and I get the error in NVIDIA/apex#1532. The previous link suggests using the `22.04-dev` version of apex (`cd apex && git checkout 22.04-dev`). With this, apex compiles successfully, and `python ./main_finetune.py` also runs training with amp successfully. If the authors can tell us the exact HEAD commit of the apex version they used, we can use that version instead!
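For anyone scripting this setup, the checkout step above could be wrapped roughly as follows. This is a sketch under assumptions: `checkout_command` and `checkout` are hypothetical helper names, the apex clone is assumed to live in a local directory called `apex`, and `git` is assumed to be on the PATH.

```python
import subprocess
from typing import List


def checkout_command(repo_dir: str, ref: str) -> List[str]:
    """Build the git command that checks out `ref` inside `repo_dir`.

    `git -C <dir>` runs git as if it had been started in <dir>,
    which mirrors `cd apex && git checkout <ref>` from the comment above.
    """
    return ["git", "-C", repo_dir, "checkout", ref]


def checkout(repo_dir: str, ref: str) -> None:
    """Run the checkout and raise CalledProcessError if git fails."""
    subprocess.run(checkout_command(repo_dir, ref), check=True)


if __name__ == "__main__":
    # Show the command for the branch reported to work in this thread.
    # Call checkout("apex", "22.04-dev") to actually run it on a local clone.
    print(" ".join(checkout_command("apex", "22.04-dev")))
```

Passing the commit hash mentioned above instead of `22.04-dev` works the same way, since `git checkout` accepts either a branch name or a commit.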
my version is