This repository has been archived by the owner on May 5, 2023. It is now read-only.

RuntimeError: CUDA out of memory #74

Open
issmirnov opened this issue Nov 1, 2021 · 1 comment

Comments

@issmirnov

Hey @paulbricman, super cool project! I ran it per instructions, but had issues with the training model:

Traceback:

RuntimeError                              Traceback (most recent call last)

<ipython-input-8-3fe4c666eb07> in <module>()
----> 1 output = trainer.train()

10 frames

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2822     if size_average is not None or reduce is not None:
   2823         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2824     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2825 
   2826 

RuntimeError: CUDA out of memory. Tried to allocate 920.00 MiB (GPU 0; 11.17 GiB total capacity; 8.44 GiB already allocated; 377.81 MiB free; 10.30 GiB reserved in total by PyTorch)

I set the settings to the medium model (8+ GB RAM), and my vault size is listed as "ideal".

Any thoughts on what I can tweak?
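Since the traceback comes from trainer.train(), one common tweak (assuming the notebook uses a Hugging Face-style Trainer; the exact parameter names here are illustrative) is to shrink the per-device batch size and compensate with gradient accumulation, which cuts peak activation memory while keeping the effective batch size the optimizer sees unchanged:

```python
# Hypothetical sketch of the usual CUDA-OOM mitigation for Trainer-based
# fine-tuning: trade per-step batch size for gradient accumulation steps.

def effective_batch_size(per_device_batch_size: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Batch size the optimizer effectively sees per weight update."""
    return per_device_batch_size * accumulation_steps * num_devices

# Hypothetical original setting: batch size 16, no accumulation.
assert effective_batch_size(16, 1) == 16

# Halving the batch size roughly halves activation memory; doubling the
# accumulation steps keeps the effective batch at 16.
assert effective_batch_size(8, 2) == 16

# In transformers.TrainingArguments these would correspond to:
#   per_device_train_batch_size=8,
#   gradient_accumulation_steps=2,
# Other common levers: fp16=True, gradient_checkpointing=True, or
# selecting a smaller model checkpoint.
```

Whether this fits depends on how the project exposes its training config, but reducing the batch size is usually the first thing to try before paying for more GPU memory.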

@issmirnov
Author

Update: I paid for a Colab Pro plan ($10/mo) and made sure my instance was GPU-optimized with 25 GB of RAM. This was enough for the model to complete training.

Note: Samples per second was 11.727, compared to 0.87 on the free plan. It may be worth adding a note to the README suggesting that people consider the paid plan.
