Reproducibility of two YOLOv5 identical train jobs #31
FYI Hi @stark-t , just realized that YOLOv5 made some updates (release v6.2), and one interesting aspect is improved training reproducibility: https://github.com/ultralytics/yolov5/releases/tag/v6.2. I didn't check all the details yet, but this might not work for parallel GPUs as we use on the cluster. It sounds like it works only for a single GPU (rough sketch below).
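A minimal sketch of how a single-GPU, seeded run could be launched from Python, assuming the v6.2 train.py exposes a --seed argument as described in the release notes; the dataset config, weights, and hyperparameters below are placeholders, not our actual settings:

```python
import subprocess

# Launch one single-GPU YOLOv5 training run with a fixed seed.
# --seed is the reproducibility argument referenced in the v6.2 release notes;
# data/weights paths below are placeholders for illustration only.
subprocess.run(
    [
        "python", "train.py",
        "--data", "dataset.yaml",   # placeholder dataset config
        "--weights", "yolov5n.pt",
        "--epochs", "100",
        "--device", "0",            # single GPU only; multi-GPU (DDP) runs are not reproducible
        "--seed", "0",
    ],
    check=True,
)
```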
Hi @stark-t , on the project with Malika I ran into this reproducibility problem again, and I'm not sure what I'm doing wrong or how to solve it :/ If running two identical models on multiple GPUs is inherently not reproducible and the results can be so different, then I'm not sure how to properly compare different model architectures or parameters. FYI, I found these posts interesting to read: Could machine learning fuel a reproducibility crisis in science?, Nature, 26 July 2022; Artificial intelligence faces reproducibility crisis, Science, 16 Feb 2018.
This seems to be an issue with detectron2 as well: facebookresearch/detectron2#4260
Hi @stark-t , I ran multiple tests and discovered that with release v6.2 of YOLOv5 one can get reproducible results, but only when using a single GPU. Since we compare different models, some with YOLOv5 and some with YOLOv7, I am not sure how to have a fully reproducible comparison. An alternative is to run a model, say, 5 times, so that we have 5 values for each metric, and then compare the averages of these values (see the sketch below). Does that make sense?
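A rough sketch of what that comparison could look like, assuming each repeated run has written the usual results.csv and that the header contains a metrics/mAP_0.5 column (the run directories below are hypothetical):

```python
import csv
import statistics

# Hypothetical result files from 5 repeated runs of the same model.
run_files = [f"runs/train/exp{i}/results.csv" for i in range(1, 6)]

# Metric column as written by YOLOv5; header cells may carry leading spaces,
# hence the strip() below. Adjust the name if the header differs.
METRIC = "metrics/mAP_0.5"

final_scores = []
for path in run_files:
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Normalise whitespace in the header keys and take the last epoch's value.
    last = {k.strip(): v for k, v in rows[-1].items()}
    final_scores.append(float(last[METRIC]))

print(f"mAP@0.5 over {len(final_scores)} runs: "
      f"mean={statistics.mean(final_scores):.4f}, "
      f"std={statistics.stdev(final_scores):.4f}")
```

Reporting mean and standard deviation per model would at least make the run-to-run noise visible when comparing YOLOv5 and YOLOv7 variants.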
For the purpose of this paper, it would be too computationally expensive to train the models several times. I'll close this issue here. Hopefully YOLOv5 and v7 will allow reproducibility in the future when trained in parallel as well.
Hi @stark-t , I ran two identical nano models on the Clara cluster and the results are a bit different.
Below you can look at the confusion matrices on the validation dataset. You can also find the results.csv for each run at the bottom of this comment.
I personally do not like to see those differences in the two identical nano runs (but I can learn to accept it :D ). I am not sure how to set a seed for yolov5 so that two runs of the same model are identical, or if that is even possible with the current configuration. Sadly, we did not yet see any parameter implemented with argparse to take a seed. There is a discussion here ultralytics/yolov5#1222 pointing at the PyTorch reproducibility notes: https://pytorch.org/docs/stable/notes/randomness.html. The main takes are that completely reproducible results are not guaranteed across PyTorch releases, platforms, or between CPU and GPU executions even with identical seeds, and that nondeterminism can only be limited, by seeding all random number generators and requesting deterministic algorithms, not fully eliminated.
Confusion matrices (images):
- nano model n1
- nano model n2
- small model s
Results csv files:
- nano model n1: results.csv
- nano model n2: results.csv
- small model s: results.csv