When training a JitModule object, I noticed that my GPU memory usage keeps creeping up. It never exceeds my maximum GPU memory, and I can train for hours without an issue. However, when I add a line of code to dump some of my predicted output to the CPU, I get random errors after a long time: either a fatal crash without any Java error message, or a Java exception telling me that my GPU has run out of memory.

The behaviour is very different from when I forget to close an intermediate tensor. In that case, a chunk of memory leaks on every iteration. Here, the GPU memory instead stays constant for a few seconds at a time (during which dozens of iterations pass), then creeps up at random. Training crashes after a lot of updates, but the numbers are not consistent (sometimes after roughly 500 updates, sometimes after roughly 700).

Any idea what may be happening? Thanks!
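For reference, "closing" a tensor here refers to JavaCPP's Pointer lifecycle: the presets' classes (PyTorch's Tensor included) extend Pointer, which is AutoCloseable. A minimal sketch of the explicit-close pattern the question contrasts against, using a plain FloatPointer as a stand-in for an intermediate tensor:

```java
import org.bytedeco.javacpp.FloatPointer;

public class CloseDemo {
    public static void main(String[] args) {
        // Stand-in for an intermediate tensor produced during a training
        // step; any JavaCPP Pointer subclass follows the same pattern.
        try (FloatPointer intermediate = new FloatPointer(1 << 20)) {
            intermediate.put(0, 1.0f);
            // ... use the intermediate value ...
        } // close() frees the native memory immediately; forgetting this
          // leaks a chunk on every iteration, the steady per-iteration
          // growth described above
    }
}
```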
Replies: 1 comment 1 reply

There are probably some things that are not getting deallocated properly. Try enclosing each iteration in a PointerScope, as described here: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

Answer selected by guangster
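For illustration, here is a minimal, self-contained sketch of the suggested pattern. The training-step calls are elided, and FloatPointer stands in for the tensors a real JitModule iteration would allocate; only PointerScope itself is the API actually being recommended:

```java
import org.bytedeco.javacpp.FloatPointer;
import org.bytedeco.javacpp.PointerScope;

public class ScopeDemo {
    public static void main(String[] args) {
        for (int iter = 0; iter < 1000; iter++) {
            // Every Pointer allocated while the scope is open is
            // registered with it, PyTorch Tensors included.
            try (PointerScope scope = new PointerScope()) {
                FloatPointer buffer = new FloatPointer(1 << 20);
                buffer.put(0, iter);
                // ... forward pass, loss, backward pass, optimizer step ...
            } // scope.close() deallocates buffer and every other Pointer
              // created in this iteration, without waiting for the GC
        }
    }
}
```

Because the try-with-resources scope catches every Pointer created while it is open, it also releases temporaries the loop never names (objects returned by library calls), which are exactly the kind of allocation that is easy to miss with per-tensor close().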