Description
When I run the training section of the BraTS 2021 notebook (located at PyTorch/Segmentation/nnUNet/notebooks/BraTS21.ipynb), the model goes through the training steps but does not actually train, as seen in the image below. The Dice score is stuck at an extremely low value, and neither it nor the loss changes at all across epochs. The warning "DALI iterator does not support resetting while epoch is not finished" also appears on every epoch, although I have not changed anything related to it.
To Reproduce
Steps to reproduce the behavior:
- Clone the DeepLearningExamples repo and install the dependencies
- Download the BraTS 2021 dataset
- Change paths in the BraTS 2021 notebook to point to file locations
- Run all of the steps up to and including the training stage
Expected behavior
I expected the model to train and reach a Dice score of at least 70 after 5 epochs.
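To make "not changing at all" checkable rather than a visual impression, the per-epoch Dice or loss values can be copied out of the training log and compared. The following is a minimal sketch: `is_stalled` is a hypothetical helper, not part of the nnUNet code, and the numbers are placeholders for illustration only, not values from this run.

```python
def is_stalled(values, tol=1e-6):
    """Return True if a per-epoch metric never moves by more than `tol`."""
    if len(values) < 2:
        # A single epoch cannot show whether the metric is moving.
        return False
    return max(values) - min(values) <= tol


# Placeholder values for illustration only (not taken from the actual log).
flat_dice = [0.5, 0.5, 0.5, 0.5, 0.5]
improving_dice = [10.0, 35.0, 55.0, 68.0, 74.0]

print(is_stalled(flat_dice))       # True
print(is_stalled(improving_dice))  # False
```

A metric that is flat to within floating-point noise from the very first epoch usually suggests that the weights are not being updated or that the inputs or labels are degenerate, rather than slow convergence.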
Environment
Please provide at least:
- PyTorch version: 1.13.1+cu116
- GPUs in the system: 2x Tesla V100-SXM2-16GB
- CUDA driver version: 515.86.01
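For anyone trying to compare against this setup, the details above can be collected with a short script. This is a sketch rather than an official tool: `collect_env` is a hypothetical helper, and `torch.version.cuda` reports the CUDA toolkit version PyTorch was built with, not the driver version (the driver version comes from `nvidia-smi`).

```python
import platform


def collect_env():
    """Gather basic version and GPU information for a bug report."""
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    # CUDA toolkit version of the PyTorch build (None for CPU-only builds).
    info["cuda_toolkit"] = torch.version.cuda
    # Names of all visible GPUs; empty list if CUDA is unavailable.
    info["gpus"] = [
        torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
    ]
    return info


env = collect_env()
for key, value in env.items():
    print(f"{key}: {value}")
```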