Nan loss in RCAN model #12
Hi! Thanks for sharing the failure case!
I found that the resource usage depends on the image patch_size; setting args.n_resgroups = 3 and args.n_resblocks = 2 makes training much faster and uses much less VRAM.
Thanks for the additional details. In this case, I guess AdaBound is a little bit sensitive on the RCAN model, and a …
I tried …
1e-4 might be too small ... If I understand correctly, the only difference between …
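For context on why the learning rate interacts with `final_lr` here: if I read the AdaBound reference implementation correctly, it clips each per-parameter step size between a lower and an upper bound that both converge to `final_lr` over training. A minimal sketch of those bound schedules (assuming the library's default `gamma = 1e-3`; the function name is mine):

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Step-size bounds at a given optimizer step (1-indexed), following
    the schedule used in the AdaBound reference implementation."""
    lower = final_lr * (1 - 1 / (gamma * step + 1))
    upper = final_lr * (1 + 1 / (gamma * step))
    return lower, upper

# Early in training the interval is very loose (almost unbounded above);
# later, both bounds tighten toward final_lr.
print(adabound_bounds(1))       # tiny lower bound, huge upper bound
print(adabound_bounds(100_000)) # both close to final_lr = 0.1
```

So early steps behave almost like Adam at `lr`, while late steps behave like SGD at `final_lr`, which is why the two settings matter together.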
Not exactly correct. Suppose dataset A has 101 samples and the batch size is set to 10: then the last batch of every epoch contains only a single sample.
I believe that's a very extreme case. Generally, a single step won't affect the whole training process, in expectation. In this case, we would encounter one much smaller batch (and hence a noisier gradient) once per epoch when using …
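The batching arithmetic discussed above can be sketched in plain Python (numbers taken from the comment: 101 samples, batch size 10, no `drop_last`; the helper name is mine):

```python
def batch_sizes(n_samples, batch_size):
    """Return the size of each batch in one epoch when the
    remainder batch is kept (PyTorch DataLoader with drop_last=False)."""
    full, rem = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if rem:
        sizes.append(rem)  # the final, smaller batch
    return sizes

print(batch_sizes(101, 10))  # ten full batches of 10, then one batch of 1
```

Passing `drop_last=True` to the DataLoader would discard that final 1-sample batch and sidestep the noisy-gradient step entirely.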
Hi, I use torch version 0.3.1. I just modified it as above, but when I ran it I got `ImportError: torch.utils.ffi is deprecated`. Would you help?
Hi, I'm a beginner, and I have a small question about it:
https://github.com/wayne391/Image-Super-Resolution/blob/master/src/models/RCAN.py
Just change

```python
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, amsgrad=False)
```

to

```python
optimizer = adabound.AdaBound(model.parameters(), lr=1e-4, final_lr=0.1)
```
NaN loss in the RCAN model, but Adam works fine.
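One way to narrow down a failure like this is to guard the training loop so it stops at the first non-finite loss, before the weights are corrupted. A minimal sketch in plain Python (`loss_value` stands in for `loss.item()` in the real loop; the helper name is mine):

```python
import math

def check_loss(loss_value, step):
    """Raise as soon as the loss turns NaN/inf so the offending step,
    batch, and learning rate can be inspected."""
    if not math.isfinite(loss_value):
        raise RuntimeError(f"non-finite loss {loss_value!r} at step {step}")
    return loss_value

# Hypothetical per-step losses; the third step diverges.
for step, loss_value in enumerate([0.52, 0.31, float("nan")]):
    try:
        check_loss(loss_value, step)
    except RuntimeError as err:
        print(err)  # report the first bad step instead of training on
        break
```

Logging the step index where the loss first diverges makes it much easier to compare optimizers (e.g. Adam vs. AdaBound) on the same data order.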