extra process when running ddp across multiple GPUs #9864
Unanswered
ChanganVR asked this question in DDP / multi-GPU / multi-node
Replies: 2 comments 5 replies
-
I experienced a similar issue and also encountered one extra process taking around 1GB of GPU memory for every additional GPU used during training. The issue was solved by setting auto_select_gpus=False when initializing the Trainer class.
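For reference, a minimal sketch of that workaround (the toy module, data, and the other Trainer arguments are just illustrative assumptions; the `auto_select_gpus` flag and `accelerator="ddp"` correspond to PyTorch Lightning 1.x):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModel(pl.LightningModule):
    """Minimal stand-in module, only to make the Trainer call concrete."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=1e-2)


train_loader = DataLoader(
    TensorDataset(torch.randn(256, 32), torch.randn(256, 1)), batch_size=32
)

# Explicitly disabling auto_select_gpus is what resolved the extra ~1GB
# process per GPU in the commenter's setup (PyTorch Lightning 1.x flags).
trainer = pl.Trainer(
    gpus=2,
    accelerator="ddp",
    auto_select_gpus=False,
)
trainer.fit(ToyModel(), train_loader)
```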
-
I now also experience an unwanted phenomenon where gpu:0 gets an extra process for each GPU used in multi-GPU training (see image). Does anyone know why this might occur?
-
Hi there,
When I run DDP across multiple GPUs, I always see additional processes for each additional GPU. For example, when I use 2 GPUs, each GPU has one process for training plus one extra process that takes about 1GB. My understanding is that this extra process handles gradient synchronization. Is this behavior expected? And is there a way to avoid the extra processes, because they really limit the ability to use more than 8 GPUs?
Below is the utilization of GPU memory when I use two GPUs:
Below is the utilization of GPU memory when I use eight GPUs:
Any feedback or suggestions are appreciated. Thank you!
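As an illustration (not from the original post), the per-GPU process listing behind the memory screenshots above can be captured with nvidia-smi's compute-apps query; the snippet below is a sketch and assumes nvidia-smi is on the PATH:

```python
import subprocess

# List the compute processes resident on each GPU with their memory footprint,
# to spot the extra ~1GB processes described above.
output = subprocess.check_output(
    [
        "nvidia-smi",
        "--query-compute-apps=pid,gpu_uuid,used_memory",
        "--format=csv,noheader",
    ],
    text=True,
)
print(output)
```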