[Re] A Simple Framework for Contrastive Learning of Visual Representations #76

Open
ADevillers opened this issue Nov 9, 2023 · 29 comments

@ADevillers

Original article: T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. "A simple framework for contrastive learning of visual representations." In: International Conference on Machine Learning. PMLR, 2020, pp. 1597–1607.

PDF URL: https://github.com/ADevillers/SimCLR/blob/main/report.pdf
Metadata URL: https://github.com/ADevillers/SimCLR/blob/main/report.metadata.tex
Code URL: https://github.com/ADevillers/SimCLR/tree/main

Scientific domain: Representation Learning
Programming language: Python
Suggested editor: @rougier

@rougier
Member

rougier commented Nov 22, 2023

Thanks for your submission and sorry for the delay. We'll assign an editor soon.

@rougier
Member

rougier commented Nov 22, 2023

@gdetor @benoit-girard @koustuvsinha Can any of you edit this submission?

@benoit-girard

I can do it!

@benoit-girard benoit-girard self-assigned this Nov 22, 2023
@benoit-girard

Good news: @charlypg has agreed to review this paper and its companion!

@benoit-girard

@pps121 would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@charlypg

charlypg commented Dec 5, 2023

Hello everybody.
I am going to review SimCLR and then BYOL. I have a lot to do for my own research over the next two weeks, but I think I can deliver the review before the 25th. Is that okay for you?
It will also depend on the required computational resources.

@benoit-girard

benoit-girard commented Dec 21, 2023

@bsciolla would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

benoit-girard commented Jan 19, 2024

@cJarvers would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@schmidDan would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@charlypg do you have an idea when you could be able to deliver your review?

@benoit-girard

@mo-arvan would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@pena-rodrigo would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@bagustris would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@birdortyedi would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@benoit-girard

@MiWeiss would you like to review this paper? And possibly (or alternatively) its companion paper #77?
Let me know!

@MiWeiss

MiWeiss commented Jan 19, 2024

> @MiWeiss would you like to review this paper? And possibly (or alternatively) its companion paper #77? Let me know!

Hi @benoit-girard. Unfortunately, I am currently not available - and I am afraid I also would not have quite the compute needed to run the code of this paper ;-)

@charlypg

Hello everybody. I am really sorry for the delay.
First of all, thank you for this work, which should benefit the community: reproduction in machine learning is always complicated, as tips and tricks are not always spelled out in the articles themselves.
Here are two lists, one for the good aspects and one for the problems I encountered.

Good:

  • Implementation tips and tricks are spelled out in the article
  • The code is clean and readable, which benefits the community
  • I could reproduce the CIFAR top-1 accuracy results
  • (PS: how can I reproduce top-5?)

Problems:

  • Config for a single GPU

    • Please provide a minimum configuration (CUDA version + conda env + requirements). I had some problems because I did not have the right CUDA/cuDNN version. Even though it takes only 5-10 minutes to fix, I think it is important.
    • The "six" module is not in the requirements
  • For evaluation I ran into the following error:
    "Error tracker : world_size missing argument for tracker". So I set world_size to 1.
    What does world_size mean and how should it be set?

  • With world_size=1 I could reproduce the evaluation results for CIFAR but not for ImageNet on Jean Zay. In the logs I obtain 59% top-1 accuracy instead of the 70% reported in the article, together with a warning:
    "WARNING: A reduction issue may have occurred (abs(50016.0 - 1563.0*1) >= 1)."

@ADevillers
Author

ADevillers commented Feb 2, 2024

Dear Reviewer (@charlypg),

Thank you very much for your insightful feedback.

I will do my best to provide the minimal configuration required to run the code on a single-GPU (non-Jean Zay) machine as soon as possible. However, I would like to highlight a challenge: I currently do not have access to a machine with these specifications. My resources are limited to Jean Zay and a CPU-only laptop, which may complicate developing and testing that configuration (hopefully this will not remain the case for long).

Regarding the "Error tracker: world_size missing argument for tracker" issue, it is my bad (and it is now fixed). This error was indeed a typo on my part, coming from recent code updates related to the warning mentioned right after in your review.

As for the warning "A reduction issue may have occurred (abs(50016.0 - 1563.0*1) >= 1)", this problem comes from an unresolved issue in PyTorch's distributed operations that can produce inconsistent reductions and thus erroneous results (for further details, please see: https://discuss.pytorch.org/t/distributed-all-reduce-returns-strange-results/89248). Unfortunately, if this warning is triggered, it indicates that the results of the current epoch (often the final one) are unreliable. The recommended approach in this case is to restart the experiment from the previous checkpoint.
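For context, the check behind that warning is essentially a consistency test on an all-reduced counter. A minimal sketch of the idea (function and variable names are illustrative, not the repository's actual code, which may compare different quantities):

```python
import torch
import torch.distributed as dist

def all_reduce_count(local_count: int, world_size: int) -> float:
    """Sum a per-rank counter across processes and sanity-check the result.

    If dist.all_reduce silently misbehaves (see the linked PyTorch thread),
    the reduced total will not match the value implied by the per-rank count.
    """
    total = torch.tensor([float(local_count)], device="cuda")
    dist.all_reduce(total, op=dist.ReduceOp.SUM)  # in-place sum over all ranks
    total = total.item()
    expected = local_count * world_size  # assumes every rank processed the same number of items
    if abs(total - expected) >= 1:
        print(f"WARNING: A reduction issue may have occurred "
              f"(abs({total} - {local_count}*{world_size}) >= 1).")
    return total
```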

Regarding the top-5 accuracy metric, it should be automatically calculated and available through TensorBoard. Could you please clarify if you encountered any difficulties in accessing these results?
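(For reference, the top-5 metric is simply a top-k accuracy; the following is a generic sketch of how such a metric is typically computed, not necessarily the exact code used in this repository:)

```python
import torch

def topk_accuracy(logits: torch.Tensor, targets: torch.Tensor, k: int = 5) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    # logits: (batch, num_classes), targets: (batch,)
    topk = logits.topk(k, dim=1).indices              # (batch, k) predicted class ids
    correct = (topk == targets.unsqueeze(1)).any(dim=1)
    return correct.float().mean().item()
```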

Best regards,
Alexandre DEVILLERS

@charlypg

charlypg commented Feb 2, 2024

Dear @ADevillers ,

Thank you for your response.
I will try the evaluation on other checkpoints. By the way, what do "even" and "odd" mean with regard to checkpoints?

Thank you in advance,
Charly PECQUEUX--GUÉZÉNEC

@ADevillers
Author

Dear @charlypg,

To clarify this part of the checkpointing strategy: saves alternate between an "odd" and an "even" checkpoint at the end of each respective epoch. This trick ensures that if a run fails during an odd-numbered epoch, we still have the state from the preceding epoch in the "even" checkpoint, and vice versa.
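A minimal sketch of that alternation (file naming and function are illustrative, not the repository's actual code):

```python
import torch

def save_alternating_checkpoint(state: dict, epoch: int, prefix: str = "expe") -> None:
    """Alternate between two checkpoint files based on epoch parity, so that
    a failure while writing one file never corrupts the only existing copy."""
    tag = "even" if epoch % 2 == 0 else "odd"
    torch.save(state, f"{prefix}_{tag}.pt")

# At the end of every epoch:
# save_alternating_checkpoint(
#     {"epoch": epoch, "model": model.state_dict(),
#      "optimizer": optimizer.state_dict(), "scheduler": scheduler.state_dict()},
#     epoch)
```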

Please feel free to reach out if you have any further questions.

Best regards,
Alexandre

@benoit-girard

@charlypg : thanks a lot for the review.

@benoit-girard

> @MiWeiss would you like to review this paper? And possibly (or alternatively) its companion paper #77? Let me know!
>
> Hi @benoit-girard. Unfortunately, I am currently not available - and I am afraid I also would not have quite the compute needed to run the code of this paper ;-)

Thanks a lot for your answer.

@benoit-girard

@ReScience/reviewers I am looking for a reviewer with expertise in machine learning to review this submission and possibly (or alternatively) its companion paper #77

@charlypg

charlypg commented Feb 7, 2024

Dear @ADevillers ,

Thank you for your answer.

I have a question about the training. Once the job corresponding to "run_simclr_imagenet.slurm" has successfully ended, I only obtain one checkpoint of the form "expe_[job_id]_[epoch_number].pt".
If I understand your paper correctly (the "Jobs too long and checkpoints" paragraph), you submit the same slurm script multiple times to reach 800 epochs?
If so, is the checkpoint from which you resume training the only thing you modify in the slurm script?

Best regards,
Charly PECQUEUX--GUÉZÉNEC

@ADevillers
Author

Dear @charlypg ,

Yes, the script itself remains unchanged; the only thing that varies is the checkpoint used. For the first execution, no checkpoint is provided. For subsequent jobs, I use the last checkpoint from the preceding job. This checkpoint contains all the relevant state, including the current epoch, scheduler, optimizer, and model weights, allowing training to resume from where it was interrupted. Note that you should not modify the other hyperparameters while doing so, as this may lead to unexpected behavior.
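For concreteness, the resume step could look like the following sketch (the checkpoint keys are illustrative, not necessarily the ones used in the repository):

```python
import torch

def resume_from_checkpoint(path, model, optimizer, scheduler) -> int:
    """Restore model/optimizer/scheduler state and return the epoch to resume at.

    If no checkpoint path is given (first job in the chain), start from epoch 0.
    """
    if path is None:
        return 0
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    scheduler.load_state_dict(ckpt["scheduler"])
    return ckpt["epoch"] + 1  # continue with the epoch after the last completed one

# Each resubmitted slurm job calls this with the latest checkpoint from the
# previous job and trains until its wall-time limit, leaving all other
# hyperparameters untouched.
```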

Best regards,
Alexandre

@charlypg

charlypg commented Apr 3, 2024

Dear @ADevillers ,

I am sorry for my late response.

I could reproduce the top-1 results on Jean Zay, so the reproduction seems convincing to me.

However, I cannot find the top-5 results. I saw there is a "runs" folder, but most of my evaluation results have not been stored in it.

Best regards,
Charly PECQUEUX--GUÉZÉNEC

@ADevillers
Author

ADevillers commented Apr 20, 2024

Dear @charlypg,

Your runs should normally be stored in the "runs" folder, in a format readable by TensorBoard, and contain all the curves (including top-5 accuracy).

Note that, when starting from a checkpoint, the data is appended to the file corresponding to the checkpoint's original run. Therefore, a run on ImageNet, even if it requires 6 to 7 restarts from checkpoints, will only produce one file (which will contain everything).

To find out where the issue could be, can you please answer the following questions:

  1. Is your "runs" folder empty?
  2. Have you been able to open tensorboard with the "runs" folder?
  3. If so, do you see any runs/curves?
  4. Are you able to find in the runs list the ones starting with the same ID as the first job of your run?
  5. If so, is there any curve you are able to see for these runs?
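In case it helps, the curves can also be extracted from the event files without opening the TensorBoard UI; here is a small sketch using TensorBoard's Python API (the scalar tag "acc/top5" is only a guess at the name used when logging, not necessarily the actual tag):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def read_scalar(run_dir: str, tag: str = "acc/top5"):
    """Load a scalar curve (e.g. top-5 accuracy) from a TensorBoard run directory."""
    acc = EventAccumulator(run_dir)
    acc.Reload()  # parse the event files on disk
    if tag not in acc.Tags().get("scalars", []):
        raise KeyError(f"Tag {tag!r} not found; available: {acc.Tags()['scalars']}")
    return [(e.step, e.value) for e in acc.Scalars(tag)]

# Usage: read_scalar("runs/<run_directory>", "acc/top5")
```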

Best,
Alexandre DEVILLERS

@rougier
Member

rougier commented May 27, 2024

@benoit-girard Gentle reminder

@rougier
Member

rougier commented Jul 11, 2024

@benoit-girard Any update on the second review?
