RuntimeError: CUDA out of memory. #144
Comments
I was able to train my network by using the CPU instead of the GPU. It took a lot longer but at least it got the job done.
hey @prashanth31, I was wondering, how did you get it to run on the GPU? what's the command I should use?
I ended up running the CPU-only version. Takes a lot of time but at least it works.
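In case it helps anyone landing here: running on CPU just means the models and tensors end up on the CPU device. I don't remember SinGAN's exact flag for this, so below is only a generic, minimal PyTorch sketch of the usual device-selection pattern (the toy layer and tensor sizes are made up for illustration, not SinGAN's actual code):

```python
import torch

# Pick the device once: fall back to CPU when no CUDA device is available
# (or when you want to sidestep GPU memory limits entirely).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in for one generator scale; everything the training loop touches
# must be moved to the same device.
model = torch.nn.Conv2d(3, 32, kernel_size=3, padding=1).to(device)
x = torch.randn(1, 3, 256, 256, device=device)  # single training image, as in SinGAN

out = model(x)
print(out.shape, out.device)
```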
I had a similar issue. I was processing a 1024-pixel image (-max_size = 1024), and at about scale 11 it crashed with the CUDA memory error. I have gone back to 512. The compute node being used is: https://www.nvidia.com/en-gb/geforce/graphics-cards/geforce-gtx-1080-ti/specifications/
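For reference, the idea is simply to keep the training image's longer side at or below the chosen cap before SinGAN builds its scale pyramid. A rough sketch of that preprocessing step, assuming Pillow is installed (the helper name, the input path, and the 512 cap are my own choices here, not SinGAN code):

```python
from PIL import Image

def cap_image_size(path, max_size=512):
    """Resize an image so its longer side is at most max_size pixels."""
    img = Image.open(path)
    scale = max_size / max(img.size)
    if scale < 1.0:
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.BICUBIC)
    return img

# Hypothetical paths; save the downscaled copy and train on that instead.
img = cap_image_size("Input/Images/my_photo.png", max_size=512)
img.save("Input/Images/my_photo_512.png")
```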
@metaphorz How did you go back to 512, and where is the code for the fix?
This was so long ago that I've forgotten. I've been using Stable Diffusion through A1111 for most runs since.
Can someone help me solve the "CUDA out of memory" error? I think it has something to do with reducing the batch size, but I am not sure where in the code I can do that. Here is the full error message:
Traceback (most recent call last):
File "main_train.py", line 29, in
train(opt, Gs, Zs, reals, NoiseAmp)
File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 39, in train
z_curr,in_s,G_curr = train_single_scale(D_curr,G_curr,reals,Gs,Zs,in_s,NoiseAmp,opt)
File "c:\Projects\PK\Phd\Paper4_GAN\SinGAN-master\SinGAN\training.py", line 162, in train_single_scale
gradient_penalty.backward()
File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "c:\ProgramData\Anaconda3\envs\torch\lib\site-packages\torch\autograd_init_.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 2.00 GiB total capacity; 1.16 GiB already allocated; 18.86 MiB free; 1.28 GiB reserved in total by PyTorch)
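One note on the batch-size idea: SinGAN trains on a single image, so there is no batch dimension to shrink; memory use is driven by the image resolution at the later scales (the crash above happens in the gradient-penalty backward pass). A small diagnostic sketch, assuming a reasonably recent PyTorch (the report_gpu_memory helper is mine, not part of SinGAN), that you could call around the heavy steps in training.py to see which scale exhausts the 2 GiB card:

```python
import torch

def report_gpu_memory(tag=""):
    """Print how much GPU memory PyTorch is holding vs. reserving, in MiB."""
    alloc = torch.cuda.memory_allocated() / 2**20    # memory held by live tensors
    reserved = torch.cuda.memory_reserved() / 2**20  # memory held by the caching allocator
    total = torch.cuda.get_device_properties(0).total_memory / 2**20
    print(f"{tag}: allocated={alloc:.0f} MiB, reserved={reserved:.0f} MiB, total={total:.0f} MiB")

# Example: wrap a backward pass to see how much memory it consumes.
if torch.cuda.is_available():
    report_gpu_memory("before")
    x = torch.randn(1, 3, 512, 512, device="cuda", requires_grad=True)
    (x ** 2).sum().backward()
    report_gpu_memory("after")
```

If the numbers show the card is nearly full before the crash, the practical fixes are the ones mentioned above: lower the image size (as with 512 instead of 1024) or fall back to CPU training.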