replace BatchNorm with LayerNorm #1035
Labels
API changes
This impacts the public API of the project (e.g. inference class).
enhancement
New feature or request
good first issue
Good for newcomers
hackathon
Is your feature request related to a problem? Please describe.
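BatchNorm violates the iid (independent and identically distributed) assumption (see here), which is fundamental to many of the objective functions in this library. Because it normalizes each sample with statistics computed over the whole batch, the output for one sample depends on the other samples in that batch. This can cause issues when training models that rely on the iid assumption.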
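To make the batch dependence concrete, here is a minimal pure-Python sketch (no affine parameters, training-mode statistics only). The function names `batch_norm` and `layer_norm` are illustrative and are not this library's API. The same sample is placed in two different batches: its batch-normalized output changes with its batch mates, while its layer-normalized output does not.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability (assumed value)


def batch_norm(batch):
    """Normalize each feature using statistics computed across the batch."""
    columns = list(zip(*batch))
    means = [fmean(col) for col in columns]
    stds = [(pvariance(col) + EPS) ** 0.5 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in batch]


def layer_norm(batch):
    """Normalize each sample using only that sample's own features."""
    out = []
    for row in batch:
        mean = fmean(row)
        std = (pvariance(row) + EPS) ** 0.5
        out.append([(x - mean) / std for x in row])
    return out


# The same sample embedded in two different batches.
sample = [1.0, 2.0, 4.0]
batch_a = [sample, [0.0, 0.0, 0.0], [2.0, 1.0, 3.0]]
batch_b = [sample, [10.0, -5.0, 7.0], [3.0, 3.0, 3.0]]

bn_a, bn_b = batch_norm(batch_a)[0], batch_norm(batch_b)[0]
ln_a, ln_b = layer_norm(batch_a)[0], layer_norm(batch_b)[0]

print("BatchNorm output depends on batch mates:", bn_a != bn_b)
print("LayerNorm output is batch-independent:  ", ln_a == ln_b)
```

GroupNorm behaves like LayerNorm in this respect, since it also computes its statistics per sample.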
Describe the solution you'd like
To address this, we propose replacing BatchNorm with LayerNorm or GroupNorm wherever feasible. These alternatives do not interfere with the iid assumption and are better suited to the requirements of our library.
Doing so requires adapting all the "factories" (e.g. here:
Describe alternatives you've considered
One alternative could be to allow users to choose between BatchNorm and LayerNorm through a configurable flag. However, this seems like a niche use case, and BatchNorm’s reliance on iid assumptions doesn’t offer a compelling reason to keep it, aside from legacy compatibility.
Additional context
Many classifiers provide an option like
use_batch_norm: bool = False
to control BatchNorm usage. We suggest updating this option to use_layer_norm
to align with the new approach.
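One possible way to handle the rename without breaking existing callers is sketched below. This is only an assumption about how a transition could look, not something decided in this issue: `resolve_norm_flag` is a hypothetical helper, and mapping a legacy `use_batch_norm=True` onto LayerNorm is a design choice that would need discussion.

```python
import warnings
from typing import Optional


def resolve_norm_flag(
    use_layer_norm: bool = False,
    use_batch_norm: Optional[bool] = None,
) -> bool:
    """Hypothetical helper: decide whether a factory should insert LayerNorm.

    `use_batch_norm` is accepted only as a deprecated alias for the new flag.
    """
    if use_batch_norm is not None:
        warnings.warn(
            "use_batch_norm is deprecated; use use_layer_norm instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: a legacy request for normalization maps onto LayerNorm.
        return use_layer_norm or use_batch_norm
    return use_layer_norm


# New-style call: no warning is emitted.
print(resolve_norm_flag(use_layer_norm=True))

# Legacy call: still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(resolve_norm_flag(use_batch_norm=True))
    print([str(w.message) for w in caught])
```

If legacy compatibility is not a concern, the simpler route is to drop `use_batch_norm` outright, as the proposal above suggests.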