Reproducibility of two YOLOv5 identical train jobs #31
FYI Hi @stark-t , just realized that YOLOv5 made some updates (release v6.2), and one interesting aspect is improved training reproducibility: https://github.com/ultralytics/yolov5/releases/tag/v6.2. I didn't check all the details yet, but this might not work for parallel GPUs as we use on the cluster. It sounds like it works only for a single GPU (rough sketch below).
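A minimal sketch of how a single-GPU, seeded run could be launched from Python, assuming the v6.2 train.py exposes a --seed argument as described in the release notes; the dataset config, weights, and hyperparameters below are placeholders, not our actual settings:

```python
import subprocess

# Launch one single-GPU YOLOv5 training run with a fixed seed.
# --seed is the reproducibility argument referenced in the v6.2 release notes;
# data/weights paths below are placeholders for illustration only.
subprocess.run(
    [
        "python", "train.py",
        "--data", "dataset.yaml",   # placeholder dataset config
        "--weights", "yolov5n.pt",
        "--epochs", "100",
        "--device", "0",            # single GPU only; multi-GPU (DDP) runs are not reproducible
        "--seed", "0",
    ],
    check=True,
)
```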
Hi @stark-t , on the project with Malika I ran into this reproducibility problem again, and I'm not sure what I'm doing wrong or how to solve it :/ If running two identical models on multiple GPUs is inherently not reproducible and the results can be so different, then I'm not sure how to properly compare different model architectures or parameters. FYI, I found these posts interesting to read: Could machine learning fuel a reproducibility crisis in science?, Nature, 26 July 2022; Artificial intelligence faces reproducibility crisis, Science, 16 Feb 2018.
This seems to be an issue with detectron2 as well: facebookresearch/detectron2#4260
Hi @stark-t , I ran multiple tests and discovered that with release v6.2 of YOLOv5 one can get reproducible results, but only when using a single GPU. Since we compare different models, some with YOLOv5 and some with YOLOv7, I am not sure how to have a fully reproducible comparison. An alternative is to run a model, say, 5 times, so that we have 5 values for each metric, and then compare the averages of these values (see the sketch below). Does that make sense?
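A rough sketch of what that comparison could look like, assuming each repeated run has written the usual results.csv and that the header contains a metrics/mAP_0.5 column (the run directories below are hypothetical):

```python
import csv
import statistics

# Hypothetical result files from 5 repeated runs of the same model.
run_files = [f"runs/train/exp{i}/results.csv" for i in range(1, 6)]

# Metric column as written by YOLOv5; header cells may carry leading spaces,
# hence the strip() below. Adjust the name if the header differs.
METRIC = "metrics/mAP_0.5"

final_scores = []
for path in run_files:
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Normalise whitespace in the header keys and take the last epoch's value.
    last = {k.strip(): v for k, v in rows[-1].items()}
    final_scores.append(float(last[METRIC]))

print(f"mAP@0.5 over {len(final_scores)} runs: "
      f"mean={statistics.mean(final_scores):.4f}, "
      f"std={statistics.stdev(final_scores):.4f}")
```

Reporting mean and standard deviation per model would at least make the run-to-run noise visible when comparing YOLOv5 and YOLOv7 variants.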
For the purpose of this paper, it would be too computationally expensive to train the models several times. I'll close this issue here. Hopefully YOLOv5 and v7 will allow reproducibility in the future when trained in parallel as well.
Hi @stark-t , I ran two identical nano models on the Clara cluster and the results are a bit different.
Below you can look at the confusion matrices on the validation dataset. You can also find the results.csv for each run at the bottom of this comment.
I personally do not like to see those differences in the two identical nano runs (but I can learn to accept it :D ). I am not sure how to set a seed for yolov5 so that two runs of the same model are identical, or if that is even possible with the current configuration. Sadly, we did not yet see any parameter implemented with argparse to take a seed. There is a discussion here ultralytics/yolov5#1222 pointing at the PyTorch reproducibility notes: https://pytorch.org/docs/stable/notes/randomness.html. The main takes are that completely reproducible results are not guaranteed across PyTorch releases, platforms, or between CPU and GPU executions even with identical seeds, and that nondeterminism can only be limited, by seeding all random number generators and requesting deterministic algorithms, not fully eliminated.
Confusion matrices (images):
- nano model n1
- nano model n2
- small model s
Results csv files:
- nano model n1: results.csv
- nano model n2: results.csv
- small model s: results.csv