At first I thought this was due to the default DataLoader configuration, where drop_last=False. However, the error does not occur at the end of the epoch but earlier: at 93% of the data (and, according to the logs, we are indeed in training_step).
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])
A single cloud in the batch -> This is unexpected. It can happen when we start from 1 km x 1 km tiles and skip data patches that are too small, but it should not happen with a dataset produced by pacasam (which is the case here). A not very convincing explanation: we really were at the end of the epoch and the logs are simply "lagging".
Anyway, even without knowing why the situation arises here, it can also arise on ordinary data that is split on the fly, so it is worth handling.
To begin with, we can set drop_last=True.
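To make the effect of drop_last concrete, here is a minimal, dependency-free sketch of the batch-size arithmetic. It assumes that samples are split evenly across ranks, padded up to ceil(N / world_size) per rank as a distributed sampler typically does; the function name and the numbers are illustrative and are not taken from the run above.

```python
import math


def batch_sizes_per_rank(num_samples, batch_size, world_size=1, drop_last=False):
    """Sizes of the batches one rank sees during one epoch.

    Assumption: each rank receives ceil(num_samples / world_size) samples
    (the sampler pads so that all ranks get the same count). drop_last only
    controls whether the final, incomplete batch is kept.
    """
    per_rank = math.ceil(num_samples / world_size)
    full_batches, remainder = divmod(per_rank, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_last:
        sizes.append(remainder)  # incomplete last batch, possibly of size 1
    return sizes


# Illustrative numbers only: 42 samples, 2 GPUs, batches of 10.
print(batch_sizes_per_rank(42, 10, world_size=2))
print(batch_sizes_per_rank(42, 10, world_size=2, drop_last=True))
```

With these made-up numbers, each rank gets 21 samples, so the last batch holds a single sample unless drop_last=True discards it.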
Next, we can edit GeometricNoneProofCollater to check the validity of the batch.
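The implementation of GeometricNoneProofCollater is not shown in this issue, so the following is only a sketch of what such a validity check could look like. The helper name keep_valid_samples and the min_samples parameter are hypothetical; the idea is simply to drop None entries and to signal an unusable batch when fewer than two samples remain.

```python
def keep_valid_samples(samples, min_samples=2):
    """Hypothetical pre-collate check (not the actual myria3d code).

    Drops None entries and returns None when too few samples remain,
    so that the caller can skip the batch instead of feeding a single
    cloud to the model.
    """
    valid = [sample for sample in samples if sample is not None]
    if len(valid) < min_samples:
        return None
    return valid


# Strings stand in for point clouds here.
print(keep_valid_samples(["cloud_a", None, "cloud_b"]))
print(keep_valid_samples(["cloud_a", None]))
```

The caller would then have to decide what to do with a None batch (for instance, skip the step), which is outside the scope of this sketch.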
EDIT: we fix the subsampling that takes place in MinimumNumNodes.
Alternatively: in this situation, the block that does not accept a single point is mlp_summit. Keeping at least two points in the decimation that directly precedes mlp_summit may be a more efficient solution. In that case, the change happens in these lines:
We go from
# Decimation should not empty clouds completely.
decimated_bincount = torch.max(
    torch.ones_like(decimated_bincount), decimated_bincount
)
to
# Decimation should not empty clouds completely.
decimated_bincount = torch.max(
    2 * torch.ones_like(decimated_bincount), decimated_bincount
)
with no impact on the model's normal behavior.
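To see why the floor matters only for small clouds, here is a dependency-free sketch that tracks the number of points left per cloud after successive decimations. The decimation factor of 4, the four stages, and the ceiling division are assumptions made for illustration; they are not confirmed by the code quoted in this issue, and the function name is hypothetical.

```python
def decimated_counts(points_per_cloud, factor=4, stages=4, floor=1):
    """Points remaining in each cloud after `stages` decimations.

    Assumptions (illustrative only): every stage divides the count by
    `factor`, rounding up, and then applies `floor` as a lower bound,
    mirroring the torch.max(...) clamp on decimated_bincount.
    """
    counts = list(points_per_cloud)
    for _ in range(stages):
        counts = [max(floor, -(-n // factor)) for n in counts]
    return counts


# A single small cloud reaches the summit with one point when floor=1...
print(decimated_counts([200], floor=1))
# ...and with two points when floor=2.
print(decimated_counts([200], floor=2))
# Large clouds never hit the floor, so their counts are unchanged.
print(decimated_counts([40000, 40000], floor=1) == decimated_counts([40000, 40000], floor=2))
```

Under these assumptions, a batch made of one small cloud is the only case where the summit sees a single row, which is what batch normalization rejects in training mode. The sketch only tracks counts; it does not model how a second point would be obtained from a cloud that has only one.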
Full trace:
Epoch 43: 93%|█████████▎| 1396/1501 [21:46<01:38, 1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.687, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1396/1501 [21:46<01:38, 1.07it/s, loss=0.186, v_num=d139, train/iou_step=0.424, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1397/1501 [21:47<01:37, 1.07it/s, loss=0.186, v_num=d139, train/iou_step=0.424, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1397/1501 [21:47<01:37, 1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.547, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1398/1501 [21:48<01:36, 1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.547, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1398/1501 [21:48<01:36, 1.07it/s, loss=0.184, v_num=d139, train/iou_step=0.535, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1399/1501 [21:49<01:35, 1.07it/s, loss=0.184, v_num=d139, train/iou_step=0.535, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1399/1501 [21:49<01:35, 1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.693, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1400/1501 [21:49<01:34, 1.07it/s, loss=0.183, v_num=d139, train/iou_step=0.693, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1400/1501 [21:49<01:34, 1.07it/s, loss=0.189, v_num=d139, train/iou_step=0.479, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1401/1501 [21:51<01:33, 1.07it/s, loss=0.189, v_num=d139, train/iou_step=0.479, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Epoch 43: 93%|█████████▎| 1401/1501 [21:51<01:33, 1.07it/s, loss=0.18, v_num=d139, train/iou_step=0.394, val/iou_step=0.675, val/iou_epoch=0.701, train/iou_epoch=0.729]
Error executing job with overrides: ['task.task_name=fit', 'datamodule.hdf5_file_path=/var/data/CGaydon/myria3d_datasets/20230727_75km2_diverse.hdf5', 'dataset_description=20230601_lidarhd_pacasam_dataset', 'datamodule.tile_width=50', 'experiment=RandLaNet_base_run_FR-MultiGPU', 'logger.comet.experiment_name=20230727_75km2_diverse-2GPUS', 'trainer.gpus=[0,1]', 'trainer.min_epochs=300', 'trainer.max_epochs=300']
Traceback (most recent call last):
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
self._dispatch()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
return self._run_train()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
self.fit_loop.run()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
batch_output = self.batch_loop.run(batch, batch_idx)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
result = self._run_optimization(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
lightning_module.optimizer_step(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 336, in optimizer_step
self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
optimizer.step(closure=closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/optimizer.py", line 88, in wrapper
return func(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/adam.py", line 100, in step
loss = closure()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
closure_result = closure()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
self._result = self.closure(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 142, in closure
step_output = self._step_fn()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 435, in _training_step
training_step_output = self.trainer.accelerator.training_step(step_kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 216, in training_step
return self.training_type_plugin.training_step(*step_kwargs.values())
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 439, in training_step
return self.model(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/overrides/base.py", line 81, in forward
output = self.module.training_step(*inputs, **kwargs)
File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 139, in training_step
targets, logits = self.forward(batch)
File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 93, in forward
logits = self.model(batch.x, batch.pos, batch.batch, batch.ptr)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/repositories/myria3d/myria3d/models/modules/pyg_randla_net.py", line 73, in forward
self.mlp_summit(b4_out_decimated[0]),
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/models/mlp.py", line 186, in forward
x = norm(x)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/norm/batch_norm.py", line 45, in forward
return self.module(x)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
return F.batch_norm(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2419, in batch_norm
_verify_batch_size(input.size())
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2387, in _verify_batch_size
raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/CGaydon/repositories/myria3d/run.py", line 121, in <module>
launch_train()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/main.py", line 48, in decorated_main
_run_hydra(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra
run_and_report(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/utils.py", line 378, in <lambda>
lambda: hydra.run(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 111, in run
_ = ret.return_value
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "/home/CGaydon/repositories/myria3d/run.py", line 57, in launch_train
return train(config)
File "/home/CGaydon/repositories/myria3d/myria3d/train.py", line 143, in train
trainer.fit(model=model, datamodule=datamodule, ckpt_path=config.model.ckpt_path)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
self._call_and_handle_interrupt(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 698, in _call_and_handle_interrupt
self.training_type_plugin.reconciliate_processes(traceback.format_exc())
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 533, in reconciliate_processes
raise DeadlockDetectedException(f"DeadLock detected from rank: {self.global_rank} \n {trace}")
pytorch_lightning.utilities.exceptions.DeadlockDetectedException: DeadLock detected from rank: 1
Traceback (most recent call last):
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
self._dispatch()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
return self._run_train()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
self.fit_loop.run()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
batch_output = self.batch_loop.run(batch, batch_idx)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
result = self._run_optimization(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
lightning_module.optimizer_step(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 336, in optimizer_step
self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
optimizer.step(closure=closure, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/optimizer.py", line 88, in wrapper
return func(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/optim/adam.py", line 100, in step
loss = closure()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
closure_result = closure()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
self._result = self.closure(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 142, in closure
step_output = self._step_fn()
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 435, in _training_step
training_step_output = self.trainer.accelerator.training_step(step_kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 216, in training_step
return self.training_type_plugin.training_step(*step_kwargs.values())
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/ddp.py", line 439, in training_step
return self.model(*args, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/pytorch_lightning/overrides/base.py", line 81, in forward
output = self.module.training_step(*inputs, **kwargs)
File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 139, in training_step
targets, logits = self.forward(batch)
File "/home/CGaydon/repositories/myria3d/myria3d/models/model.py", line 93, in forward
logits = self.model(batch.x, batch.pos, batch.batch, batch.ptr)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/repositories/myria3d/myria3d/models/modules/pyg_randla_net.py", line 73, in forward
self.mlp_summit(b4_out_decimated[0]),
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/models/mlp.py", line 186, in forward
x = norm(x)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch_geometric/nn/norm/batch_norm.py", line 45, in forward
return self.module(x)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
return F.batch_norm(
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2419, in batch_norm
_verify_batch_size(input.size())
File "/home/CGaydon/.conda/envs/myria3d/lib/python3.9/site-packages/torch/nn/functional.py", line 2387, in _verify_batch_size
raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])
Topic 1: why is there a single sample in this training batch?
-> we add drop_last just in case.
-> the training error disappears 🥳
Topic 2: how to keep the ability to "predict a single cloud containing a single point", a situation that can occur at inference.
-> In fact, we had forgotten about the MinimumNumNodes transform. It is supposed to duplicate points to avoid errors inside the model. BUT: now that we accept point clouds with num_nodes=1, we hit an edge case of the subsample_data function used by MinimumNumNodes. The condition and item.size(0) != 1: comes into play, which prevents the transform from being applied!
-> This condition has no reason to exist here: it had been added to FixedPoints to handle a different case, following this discussion.
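The edge case in Topic 2 can be illustrated with a small, dependency-free sketch. This is a hypothetical reimplementation using plain lists, not the actual subsample_data or MinimumNumNodes code; the skip_single flag stands in for the item.size(0) != 1 guard quoted above.

```python
import random


def resample_to_min_nodes(data, num_nodes, min_nodes, skip_single=False, seed=0):
    """Duplicate points (sampling with replacement) up to min_nodes.

    `data` maps attribute names to per-point lists of length num_nodes.
    skip_single=True mimics the guard `item.size(0) != 1`: per-point
    attributes of length 1 are then left untouched.
    """
    if num_nodes >= min_nodes:
        return dict(data)
    rng = random.Random(seed)
    choice = [rng.randrange(num_nodes) for _ in range(min_nodes)]
    out = {}
    for key, item in data.items():
        is_per_point = isinstance(item, list) and len(item) == num_nodes
        if is_per_point and not (skip_single and len(item) == 1):
            out[key] = [item[i] for i in choice]  # resample this attribute
        else:
            out[key] = item  # left as-is
    return out


one_point_cloud = {"pos": [(0.0, 0.0, 0.0)], "y": [3]}
# Without the guard, the single point is duplicated as intended.
print(len(resample_to_min_nodes(one_point_cloud, 1, 5)["pos"]))
# With the guard, a one-point cloud silently stays at one point.
print(len(resample_to_min_nodes(one_point_cloud, 1, 5, skip_single=True)["pos"]))
```

In this sketch the guard is harmless for clouds with two or more points and defeats the transform exactly when num_nodes=1, which matches the behavior described above.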
That said, the size [1, 512] does suggest a combination of: