
Running On Multiple GPUs #27

Open
playmakerbugger opened this issue Jun 23, 2021 · 5 comments

@playmakerbugger

Hi, I am running the image harmonization part of the model with --train_stages 6, --max_size 350, and --lr_scale 0.5 to increase the quality of the images.

However, once I get to the second stage of training, it crashes because it runs out of CUDA memory. I altered the torch device so the model can use more than one GPU (say, GPUs 0 and 1) and wrapped the model in DataParallel so that it can run in parallel on multiple GPUs. However, it still only runs on one GPU.

Do you have any suggestions to fix this issue?

@tohinz
Owner

tohinz commented Jun 23, 2021

Without seeing the code it's difficult to say.
Have you changed how the --gpu parameter is handled (in the main_train.py file)?
By default it's set to 0, and later in the code we do (see here):

```python
if torch.cuda.is_available():
    torch.cuda.set_device(opt.gpu)
```

You might have to change that to get it to work.
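For context, torch.cuda.set_device only selects the default CUDA device; it does not spread work across GPUs on its own. Below is a minimal sketch of the kind of DataParallel change being attempted, assuming a toy stand-in model and GPU IDs 0 and 1 (none of these names come from the repository's actual code):

```python
import torch
import torch.nn as nn

# Toy stand-in for the generator built in main_train.py; the real
# model comes from the repository.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())

if torch.cuda.is_available():
    torch.cuda.set_device(0)   # selects the default device only
    model = model.cuda()
    if torch.cuda.device_count() > 1:
        # Replicate the module on GPUs 0 and 1; each forward pass
        # splits the input batch along dim 0 across the replicas.
        model = nn.DataParallel(model, device_ids=[0, 1])

    x = torch.randn(4, 3, 64, 64).cuda()  # batch of 4 -> 2 per GPU
    y = model(x)                          # output gathered on GPU 0
```

Note that DataParallel only helps when the batch dimension can be split; it replicates the whole model on every device rather than partitioning it.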

@playmakerbugger
Author

Hi, I'm still having the problem. I changed that line to a torch device covering the two GPUs (passed to set_device). It still runs on one GPU.
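Two hedged observations that may explain this, neither confirmed against the repository's code: torch.cuda.set_device accepts a single device, so passing two GPUs there cannot enable multi-GPU execution; and nn.DataParallel scatters inputs along the batch dimension, so single-image training with a batch size of 1 leaves no work for a second GPU. A small illustration with a toy linear layer:

```python
import torch
import torch.nn as nn

# set_device takes exactly one device; it cannot register two GPUs.
torch.cuda.set_device(0)

# DataParallel scatters each input along dim 0 across the replicas,
# so a batch of size 1 is handled by a single GPU regardless.
model = nn.DataParallel(nn.Linear(8, 8).cuda(), device_ids=[0, 1])

x1 = torch.randn(1, 8).cuda()  # batch size 1: effectively one GPU
x4 = torch.randn(4, 8).cuda()  # batch size 4: two samples per GPU
print(model(x1).shape, model(x4).shape)
```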

@tohinz
Owner

tohinz commented Jun 30, 2021

Sorry for the late response.
What kind of GPU are you running this on and how much VRAM does it have?
I run all of my experiments on a single GPU with ~12GB VRAM without problems.

@playmakerbugger
Author

GPU 0, with about 30000 MiB of VRAM.

@Liz1317

Liz1317 commented Jul 16, 2022

I have the same problem. Is there any solution?
