[Re] When Does Label Smoothing Help? #75
Comments
Thanks for your submission; we'll assign an editor soon. |
@koustuvsinha @gdetor Can either of you edit this submission in machine learning? |
@rougier I can handle this. |
@gdetor Thanks! |
@gdetor thank you for agreeing to handle this submission! Is there anything we can do to move this submission forward? |
Any update? |
Dear reviewers @ReScience/reviewers Could anybody review this submission? |
I'd be interested in reviewing this submission, but I should mention that I doubt I can rerun all the experiments due to computational constraints. |
Okay, I will review this work by the end of July. |
I apologize, but I have not been able to review this submission yet; I should be able to write the review within the next few weeks. |
Thanks. Any progress? |
@mo-arvan gentle reminder |
In this paper, Wagner et al. provide a reproduction report of Müller et al.'s work on label smoothing. They begin with a concise introduction to the original study and the motivations behind it. The authors then present essential details regarding the models and datasets used, noting specific variations driven by limited computational resources. The authors have done an excellent job of providing documentation and instructions for using their released code. Their repository includes multiple Jupyter notebooks detailing the conducted experiments, along with specified dependency requirements to facilitate the setup process. To further simplify future installations, I created a Docker container as part of the review process; the files and instructions are available in my forked repository.

In their initial results, the authors examine the effect of label smoothing on model accuracy. While Müller et al. claimed that label smoothing positively impacts the test accuracy of trained models, Wagner et al. suggest that it enhances accuracy by reducing overfitting, a claim not made by the original authors. However, their results indicate mixed effects: out of eight experiments, three showed higher accuracy without label smoothing. Upon reviewing their code (https://github.com/sdwagner/re-labelsmoothing/blob/fb6c3634d2049ef7f175e7a992f109c43680fae3/datasets/datasets.py), it appears that they do not load the test set, raising the possibility that the reported results are based on the validation set. Unlike the original study, this reproduction does not include confidence intervals, and the small differences in accuracy could be attributed to randomness in the training process. Adding uncertainty analysis would significantly strengthen this work.

In the next section, the authors reproduce a visualization experiment from the original study that demonstrates the effect of label smoothing on the activations of the penultimate layer and the network output. Figure 2 in their work aligns with the findings of the original study, although there is a minor discrepancy in the order of the columns in the visualization.

The authors then investigate the impact of label smoothing on Expected Calibration Error (ECE). With the exception of the results on the MNIST dataset using a fully connected network, their findings generally align with those of the original study. The reported results for training a transformer model for translation are mixed, with not all findings matching the original study. As with the accuracy results, the authors report findings based on the validation set, which may account for some discrepancies.

Finally, the results of the distillation experiments on fully connected networks for MNIST are consistent with the original study, though there is a slight increase in error. Ultimately, the authors confirm the observation made by Müller et al. regarding accuracy degradation in students when the teacher is trained with label smoothing. Figures 7 and 8 lack the confidence intervals present in the original study, which would have been beneficial for comparison.

Minor editing suggestions: |
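For readers unfamiliar with the two quantities at the center of this review, here is a minimal Python sketch of label-smoothed targets and Expected Calibration Error. It is illustrative only and not the authors' code; the function names, the `alpha=0.1` default, and the 10-bin ECE setup are assumptions made for this example.

```python
import numpy as np

def smooth_labels(labels, num_classes, alpha=0.1):
    """Label smoothing as in Müller et al.: mix the one-hot target with a
    uniform distribution. The true class ends up at (1 - alpha) + alpha/K,
    every other class at alpha/K."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - alpha) * one_hot + alpha / num_classes

def expected_calibration_error(confidences, predictions, targets, n_bins=10):
    """ECE: partition samples into confidence bins and take the
    sample-weighted average of |accuracy(bin) - mean confidence(bin)|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = np.mean(predictions[in_bin] == targets[in_bin])
            conf = np.mean(confidences[in_bin])
            ece += in_bin.mean() * abs(acc - conf)
    return ece

# Toy usage: two samples, three classes.
targets = smooth_labels(np.array([0, 2]), num_classes=3)
print(targets)  # true class ~0.933, other classes ~0.033 each
```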
@mo-arvan Thanks for creating a Dockerfile! Feel free to open a PR to integrate it into our repository 😊 |
Glad you find it useful. Sure, I'll submit a pull request, and I'd be happy to engage in a discussion as well. One last minor comment: your use of vector graphics in your figures is a step up from the original publication; I'd suggest changing the color palette and the patterns to further improve the presentation of the figures, e.g., Figure 3 (b). |
@mo-arvan Thanks again for your detailed comments! In the following, we address each of the points you raised:
|
@mo-arvan Please let me know if you agree with the responses so I can publish the paper. Thank you. |
Yes, the response addresses my main concerns. I was wrong about the validation/test splits. |
@gdetor Can you kindly let us know what the next steps are? Is there anything required from our side? Thank you! |
Original article: Rafael Müller, Simon Kornblith, and Geoffrey E. Hinton. "When does label smoothing help?" Advances in Neural Information Processing Systems 32 (2019). (https://arxiv.org/pdf/1906.02629.pdf)
PDF URL: https://github.com/sdwagner/re-labelsmoothing/blob/main/report/article.pdf
Metadata URL: https://github.com/sdwagner/re-labelsmoothing/blob/main/report/metadata.yaml
Code URL: https://github.com/sdwagner/re-labelsmoothing
Scientific domain: Machine Learning
Programming language: Python
Suggested editor: Georgios Detorakis or Koustuv Sinha