Multiprocessing doesn't work! #15880
Unanswered
vadinabronin asked this question in DDP / multi-GPU / multi-node
Has no one asked this question before? I have a trainer:

trainer = pl.Trainer(
    accelerator='gpu',
    devices=2,
    max_epochs=cfg.epoch,
    logger=pl.loggers.CSVLogger(save_dir="logs/"),
    precision=16,
    strategy='dp',
)

a PyTorch Lightning module (not shared here because it is too big), and a function that computes the distillation feature-map loss (the function is declared outside the module). The dp strategy does not keep the tensors used inside that function on the same device, and I get this exception when training the model:
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/overrides/data_parallel.py", line 65, in forward
output = super().forward(*inputs, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/overrides/base.py", line 79, in forward
output = self.module.training_step(*inputs, **kwargs)
File "/tmp/ipykernel_17/687536527.py", line 43, in training_step
distill_loss = self.atloss.forward()
File "/tmp/ipykernel_17/125274374.py", line 57, in forward
teacher_feature_map)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
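A minimal sketch of what is probably going wrong, based only on the names visible in the traceback (`atloss`, `teacher_feature_map`); the class body here is hypothetical. If the loss object is a plain Python class holding a tensor that was created on cuda:0, DataParallel has no way to copy that tensor to the replica running on cuda:1:

```python
import torch


class ATLoss:
    """Plain Python object holding a pre-computed teacher feature map.

    Because this is not an nn.Module, DataParallel never replicates it,
    so the tensor it holds stays on whichever device created it.
    """

    def __init__(self, teacher_feature_map: torch.Tensor):
        self.teacher_feature_map = teacher_feature_map  # pinned to cuda:0

    def forward(self, student_feature_map: torch.Tensor) -> torch.Tensor:
        # On the replica running on cuda:1, student_feature_map lives on
        # cuda:1 while self.teacher_feature_map is still on cuda:0, which
        # raises: "Expected all tensors to be on the same device ..."
        return (student_feature_map - self.teacher_feature_map).pow(2).mean()
```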
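One way to make this work under dp, again as a sketch under the same assumptions (the real `ATLoss` signature may differ): make the loss an `nn.Module`, register the teacher tensor as a buffer so each replica receives a copy on its own GPU, and align devices explicitly inside `forward` as a fallback:

```python
import torch
import torch.nn as nn


class ATLoss(nn.Module):
    """Distillation loss as an nn.Module so dp can replicate its state."""

    def __init__(self, teacher_feature_map: torch.Tensor):
        super().__init__()
        # A registered buffer is moved/copied together with the module,
        # so each DataParallel replica gets it on its own GPU.
        self.register_buffer("teacher_feature_map", teacher_feature_map)

    def forward(self, student_feature_map: torch.Tensor) -> torch.Tensor:
        # Defensive fallback: align devices explicitly in case the loss
        # was not replicated (e.g. it is not a child of the module).
        teacher = self.teacher_feature_map.to(student_feature_map.device)
        return (student_feature_map - teacher).pow(2).mean()
```

For the buffer to be replicated, the loss must be assigned as an attribute of the LightningModule (e.g. `self.atloss = ATLoss(...)` in `__init__`). Alternatively, sidestep dp entirely with `strategy="ddp"`, which Lightning generally recommends over dp.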
-
@vadinabronin did you manage to find a solution? I am facing the same problem.