Hi, I tried to run training but ran into the following error:
Using device: cuda:0
#######################################################################
Please cite the following paper when using nnU-Net:
Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.
#######################################################################
This is the configuration used by this training:
Configuration name: 3d_lowres
{'data_identifier': 'nnUNetPlans_3d_lowres', 'preprocessor_name': 'DefaultPreprocessor', 'batch_size': 2, 'patch_size': [80, 192, 160], 'median_image_size_in_voxels': [126, 275, 275], 'spacing': [1.1161767430256981, 0.6576431982019902, 0.6576431982019902], 'normalization_schemes': ['CTNormalization'], 'use_mask_for_norm': [False], 'UNet_class_name': 'PlainConvUNet', 'UNet_base_num_features': 32, 'n_conv_per_stage_encoder': [2, 2, 2, 2, 2, 2], 'n_conv_per_stage_decoder': [2, 2, 2, 2, 2], 'num_pool_per_axis': [4, 5, 5], 'pool_op_kernel_sizes': [[1, 1, 1], [2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]], 'unet_max_num_features': 320, 'resampling_fn_data': 'resample_data_or_seg_to_shape', 'resampling_fn_seg': 'resample_data_or_seg_to_shape', 'resampling_fn_data_kwargs': {'is_seg': False, 'order': 3, 'order_z': 0, 'force_separate_z': None}, 'resampling_fn_seg_kwargs': {'is_seg': True, 'order': 1, 'order_z': 0, 'force_separate_z': None}, 'resampling_fn_probabilities': 'resample_data_or_seg_to_shape', 'resampling_fn_probabilities_kwargs': {'is_seg': False, 'order': 1, 'order_z': 0, 'force_separate_z': None}, 'batch_dice': False, 'next_stage': '3d_cascade_fullres'}
These are the global plan.json settings:
{'dataset_name': 'Dataset111_lv', 'plans_name': 'nnUNetPlans', 'original_median_spacing_after_transp': [0.6, 0.353515625, 0.353515625], 'original_median_shape_after_transp': [230, 512, 512], 'image_reader_writer': 'SimpleITKIO', 'transpose_forward': [0, 1, 2], 'transpose_backward': [0, 1, 2], 'experiment_planner_used': 'ExperimentPlanner', 'label_manager': 'LabelManager', 'foreground_intensity_properties_per_channel': {'0': {'max': 2150.0, 'mean': 112.53475952148438, 'median': 105.0, 'min': -2048.0, 'percentile_00_5': -41.0, 'percentile_99_5': 388.0, 'std': 65.23908233642578}}}
2023-11-21 17:10:10.291010: unpacking dataset...
2023-11-21 17:10:12.882671: unpacking done...
2023-11-21 17:10:12.883862: do_dummy_2d_data_aug: False
2023-11-21 17:10:12.884239: Using splits from existing split file: /home/app/output/preprocessed_data/Dataset111_lv/splits_final.json
2023-11-21 17:10:12.884593: The split file contains 5 splits.
2023-11-21 17:10:12.884640: Desired fold for training: 0
2023-11-21 17:10:12.884670: This split has 27 training and 7 validation cases.
2023-11-21 17:10:12.896909: Unable to plot network architecture:
2023-11-21 17:10:12.896971: No module named 'hiddenlayer'
2023-11-21 17:10:12.910495:
2023-11-21 17:10:12.910562: Epoch 0
2023-11-21 17:10:12.910645: Current learning rate: 0.01
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/opt/conda/envs/nnunet/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/nnunet/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
    raise e
  File "/opt/conda/envs/nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
using pin_memory on device 0
Traceback (most recent call last):
  File "/opt/conda/envs/nnunet/bin/nnUNetv2_train", line 8, in <module>
    sys.exit(run_training_entry())
  File "/home/app/nnUNet/nnunetv2/run/run_training.py", line 268, in run_training_entry
    run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
  File "/home/app/nnUNet/nnunetv2/run/run_training.py", line 204, in run_training
    nnunet_trainer.run_training()
  File "/home/app/nnUNet/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1242, in run_training
    train_outputs.append(self.train_step(next(self.dataloader_train)))
  File "/opt/conda/envs/nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 196, in __next__
    item = self.__get_next_item()
  File "/opt/conda/envs/nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 181, in __get_next_item
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
I am running WSL2 on Windows with 120 GB of memory, 30 GB of swap, and an RTX 3090 GPU (CUDA). Preprocessing completed without errors, and my dataset contains only 34 NRRD files for images and labels. Could you please help me solve this issue? Thanks!
I also tried setting the number of data augmentation workers to 1 with export nnUNet_n_proc_DA=1, but that didn't solve the error.
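The log above only shows that a background worker died, not why. A possible next debugging step, sketched under assumptions: rerun the same training with nnUNet_n_proc_DA=0. This assumes the installed nnU-Net version falls back to single-threaded augmentation in the main process for that value, so the underlying exception would be printed directly; that is worth checking against the installed version. The dataset ID (111), configuration (3d_lowres) and fold (0) are taken from the log, and build_debug_run is a hypothetical helper name that only assembles the environment and command.

```python
import os
import shlex


def build_debug_run(dataset="111", configuration="3d_lowres", fold="0", n_proc_da="0"):
    """Return (env, cmd) for a training run with a chosen number of DA workers.

    Hypothetical helper: it only builds the environment and the command line,
    it does not start the training itself.
    """
    env = dict(os.environ)
    # Assumption: 0 makes nnU-Net augment in the main process instead of
    # spawning background workers, so the real error is not swallowed.
    env["nnUNet_n_proc_DA"] = str(n_proc_da)
    # Positional arguments: dataset name or ID, configuration, fold.
    cmd = ["nnUNetv2_train", str(dataset), configuration, str(fold)]
    return env, cmd


if __name__ == "__main__":
    env, cmd = build_debug_run()
    # Print the equivalent shell invocation for inspection.
    print("nnUNet_n_proc_DA=" + env["nnUNet_n_proc_DA"], shlex.join(cmd))
    # To actually launch the run (requires nnU-Net to be installed):
    # import subprocess
    # subprocess.run(cmd, env=env, check=True)
```

If the single-process run fails with a clear exception (for example while reading a case), that message is the one the multi-worker run was hiding. If it runs fine, the workers may be getting killed from outside the Python process, which would point at memory limits rather than the data.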