CUDA error: an illegal memory access was encountered #8

Open
yushanshan05 opened this issue Feb 24, 2020 · 3 comments

Comments

@yushanshan05

Hi, thanks for your great work.
I am training on my own dataset, which has ten classes, with fps=1 and without the --fp16 flag.
max_iter=2
batch_size=2

When I start training, I get the error below. It happens during the third iteration: the first two iterations complete normally, including the forward pass, the backward pass, and optimizer.step(). When the third iteration starts, the following error is thrown:
Traceback (most recent call last):
  File "train.py", line 602, in <module>
    main()
  File "train.py", line 235, in main
    train(args, nets, optimizer, scheduler, train_dataloader, val_dataloader, log_file)
  File "train.py", line 362, in train
    optimizer.step()
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 51, in wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/adam.py", line 103, in step
    denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])
RuntimeError: CUDA error: an illegal memory access was encountered
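
Because CUDA kernels are launched asynchronously, the Python frame shown in a traceback like the one above (here `adam.py`) is not necessarily where the faulting kernel was launched. A minimal sketch for narrowing this down, assuming the variable is set before CUDA is initialized (ideally before `import torch`): setting `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, so the error should surface at the call that actually caused it. The helper name `enable_blocking_launches` is hypothetical and not part of this repository.

```python
import os


def enable_blocking_launches(env=os.environ):
    """Force synchronous CUDA kernel launches for debugging.

    Assumption: this runs before the first CUDA call (ideally before
    `import torch`); otherwise the setting may not take effect.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env["CUDA_LAUNCH_BLOCKING"]


enable_blocking_launches()
# import torch  # import the framework only after the variable is set
```

The same effect can be obtained from the shell with `CUDA_LAUNCH_BLOCKING=1 python train.py`. Training runs slower in this mode, so it is meant for debugging only.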

@Avashist1998

I am facing the same issue while working on the SPADE code.

Traceback (most recent call last):
  File "train.py", line 40, in <module>
    trainer.run_generator_one_step(data_i)
  File "/home/abhay/inpaint-sa/trainers/pix2pix_trainer.py", line 38, in run_generator_one_step
    self.optimizer_G.step()
  File "/home/abhay/miniconda3/envs/pytorch36/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/home/abhay/miniconda3/envs/pytorch36/lib/python3.6/site-packages/torch/optim/adam.py", line 111, in step
    denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])
RuntimeError: CUDA error: an illegal memory access was encountered

@mathshangw

Excuse me, did you solve it?

@Avashist1998

For me it was a hardware issue. The GPU was overheating and crashing because the fans were not spinning up at higher temperatures.
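
To check whether overheating is a plausible cause on a given machine, one option is to poll temperature and fan speed while training runs. The following is a rough sketch, assuming an NVIDIA GPU with `nvidia-smi` on the PATH; the function names and the 85 °C threshold are illustrative assumptions, not vendor limits or anything from this thread.

```python
import shutil
import subprocess

# nvidia-smi query for GPU index, temperature (deg C) and fan speed (%).
QUERY = ["nvidia-smi", "--query-gpu=index,temperature.gpu,fan.speed",
         "--format=csv,noheader,nounits"]


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV rows into dicts; non-numeric fields become None."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp, fan = [field.strip() for field in line.split(",")]
        stats.append({
            "index": int(index),
            "temp_c": int(temp) if temp.isdigit() else None,
            "fan_pct": int(fan) if fan.isdigit() else None,
        })
    return stats


def read_gpu_stats():
    """Return stats for all GPUs, or [] if nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(out.stdout)


def overheating(stats, limit_c=85):
    """GPUs at or above limit_c (85 is an assumed threshold, not a spec)."""
    return [s for s in stats
            if s["temp_c"] is not None and s["temp_c"] >= limit_c]
```

Calling `overheating(read_gpu_stats())` periodically during training would list the GPUs running hot; a high temperature together with a fan speed of 0 would match the symptom described above.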


3 participants