When training a JitModule object, I noticed that my GPU memory usage keeps creeping up. It never exceeds my maximum GPU memory, and I can train for hours without an issue. However, when I add a line of code to dump some of my predicted output to the CPU, I get random errors after a long time: either a fatal crash without any Java error message, or a Java exception telling me that my GPU has run out of memory.

The behaviour is very different from when I forget to close an intermediate tensor. In that case, a chunk of memory leaks on every iteration. Here, the GPU memory instead stays constant for a few seconds at a time (during which dozens of iterations pass), then creeps up at random. Training crashes after a lot of updates, but the numbers are not consistent (sometimes after roughly 500 updates, sometimes after roughly 700).

Any idea what may be happening? Thanks!
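For reference, "closing" a tensor here refers to JavaCPP's Pointer lifecycle: the presets' classes (PyTorch's Tensor included) extend Pointer, which is AutoCloseable. A minimal sketch of the explicit-close pattern the question contrasts against, using a plain FloatPointer as a stand-in for an intermediate tensor:

```java
import org.bytedeco.javacpp.FloatPointer;

public class CloseDemo {
    public static void main(String[] args) {
        // Stand-in for an intermediate tensor produced during a training
        // step; any JavaCPP Pointer subclass follows the same pattern.
        try (FloatPointer intermediate = new FloatPointer(1 << 20)) {
            intermediate.put(0, 1.0f);
            // ... use the intermediate value ...
        } // close() frees the native memory immediately; forgetting this
          // leaks a chunk on every iteration, the steady per-iteration
          // growth described above
    }
}
```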
Replies: 1 comment 1 reply

There are probably some things that are not getting deallocated properly. Try enclosing each iteration in a PointerScope, as described here: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

Answer selected by guangster
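For illustration, here is a minimal, self-contained sketch of the suggested pattern. The training-step calls are elided, and FloatPointer stands in for the tensors a real JitModule iteration would allocate; only PointerScope itself is the API actually being recommended:

```java
import org.bytedeco.javacpp.FloatPointer;
import org.bytedeco.javacpp.PointerScope;

public class ScopeDemo {
    public static void main(String[] args) {
        for (int iter = 0; iter < 1000; iter++) {
            // Every Pointer allocated while the scope is open is
            // registered with it, PyTorch Tensors included.
            try (PointerScope scope = new PointerScope()) {
                FloatPointer buffer = new FloatPointer(1 << 20);
                buffer.put(0, iter);
                // ... forward pass, loss, backward pass, optimizer step ...
            } // scope.close() deallocates buffer and every other Pointer
              // created in this iteration, without waiting for the GC
        }
    }
}
```

Because the try-with-resources scope catches every Pointer created while it is open, it also releases temporaries the loop never names (objects returned by library calls), which are exactly the kind of allocation that is easy to miss with per-tensor close().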