Emulating multiple devices with a single GPU #8630
-
Hello, I have a single GPU, but I would like to spawn multiple replicas on that single GPU and train a model with DDP. Of course, each replica would have to use a smaller batch size in order to fit in memory. (For my use case, I am not interested in having a single replica with a large batch size.) I tried passing such a configuration to the `Trainer`, but in the end it crashed. Please, is there any way to split a single GPU into multiple replicas with Lightning?

P.S.: Ray has really nice support for fractional GPUs: https://docs.ray.io/en/master/using-ray-with-gpus.html#fractional-gpus. I've never used them with Lightning, but maybe it could be a workaround?
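(For reference, the fractional-GPU usage from the linked page looks roughly like this — plain Ray, not integrated with Lightning; the task body is just illustrative:)

```python
# Sketch of Ray fractional GPUs: two tasks share one physical GPU,
# each reserving half of it via num_gpus=0.5.
import os
import ray

ray.init(num_gpus=1)

@ray.remote(num_gpus=0.5)  # each task gets a 0.5-GPU share of the same device
def use_gpu():
    # Ray sets CUDA_VISIBLE_DEVICES so the task sees its assigned GPU
    return os.environ.get("CUDA_VISIBLE_DEVICES")

print(ray.get([use_gpu.remote(), use_gpu.remote()]))
```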
-
Hmm, interesting use case. AFAIU it is not possible, at least with the `nccl` backend. See https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/communicators.html: NCCL expects each rank of a communicator to use its own CUDA device, so two DDP processes cannot share a single GPU.

It probably can be done if you write custom gradient-syncing logic, which moves the gradients to RAM before syncing and syncs them over a CPU backend such as `gloo`.
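A minimal sketch of what such a sync could look like, assuming a `gloo` process group has already been initialized (the function name `sync_grads_via_cpu` and the averaging are illustrative, not an existing Lightning hook):

```python
import torch
import torch.distributed as dist

def sync_grads_via_cpu(model: torch.nn.Module, group=None):
    """All-reduce gradients on CPU so several processes can share one GPU."""
    world_size = dist.get_world_size(group)
    for param in model.parameters():
        if param.grad is None:
            continue
        cpu_grad = param.grad.detach().cpu()           # move gradient off the GPU
        dist.all_reduce(cpu_grad, op=dist.ReduceOp.SUM, group=group)
        param.grad.copy_(cpu_grad.div_(world_size))    # average and copy back
```

You would call it between `loss.backward()` and `optimizer.step()` in each process.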
-
For reference: it seems to be possible when the backend is `gloo` instead of `nccl`. See the discussion here: #8630 (reply in thread).
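An untested minimal sketch of what that could look like with plain `torch.distributed` (two processes both pinned to `cuda:0`, `gloo` backend; the address, port, and toy model are placeholders):

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank: int, world_size: int):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"          # placeholder port
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(10, 1).to("cuda:0")  # every rank uses the same GPU
    ddp_model = DDP(model)                       # gradient sync runs over gloo
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

    x = torch.randn(4, 10, device="cuda:0")      # small per-replica batch
    ddp_model(x).sum().backward()
    optimizer.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)        # two replicas on one GPU
```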