Support batch renormalization #112

Open

ndryden opened this issue Sep 17, 2017 · 0 comments

Comments

ndryden (Collaborator) commented Sep 17, 2017

Paper: https://arxiv.org/abs/1702.03275

Batch renormalization addresses a fundamental problem with batchnorm: activations are normalized with per-mini-batch statistics during training but with (approximations of) population statistics during inference, so the model is trained on activations from a different distribution than it sees at inference time. In particular, it helps when

  • Mini-batch sizes are small;
  • Or, samples are not IID. (This isn't really a problem for our current models.)

This is most relevant for CPU codes, where we often have quite a small local mini-batch size. (Batch renorm does not seem to fully recover the generalization accuracy of training with a larger mini-batch, but it gets closer.)
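For concreteness, here is a minimal NumPy sketch of the training-time forward pass from the paper. The function and parameter names are illustrative, not from LBANN, and the paper actually anneals `r_max` and `d_max` from 1 and 0 over the course of training rather than fixing them:

```python
import numpy as np

def batch_renorm_forward(x, gamma, beta, running_mean, running_var,
                         r_max=3.0, d_max=5.0, momentum=0.01, eps=1e-5):
    """Batch renormalization forward pass (training mode).

    x: (batch, features) activations. Returns the normalized output
    and the updated running statistics.
    """
    mu_b = x.mean(axis=0)
    var_b = x.var(axis=0)
    sigma_b = np.sqrt(var_b + eps)
    sigma_run = np.sqrt(running_var + eps)

    # r and d correct the per-mini-batch statistics toward the
    # population statistics; both are treated as constants in backprop.
    r = np.clip(sigma_b / sigma_run, 1.0 / r_max, r_max)
    d = np.clip((mu_b - running_mean) / sigma_run, -d_max, d_max)

    x_hat = (x - mu_b) / sigma_b * r + d
    y = gamma * x_hat + beta

    # Update the moving estimates of the population statistics.
    running_mean = (1 - momentum) * running_mean + momentum * mu_b
    running_var = (1 - momentum) * running_var + momentum * var_b
    return y, running_mean, running_var
```

With `r_max = 1` and `d_max = 0` this reduces to standard batchnorm, which is why the gradual annealing schedule lets training start as plain batchnorm and transition smoothly to renormalization.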

oyamay pushed a commit to oyamay/lbann that referenced this issue Jun 16, 2020