Add mode='fan_geom_avg' to nn.initializers.variance_scaling #25649

Open
carlosgmartin opened this issue Dec 20, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@carlosgmartin
Contributor

Feature request: Add 'fan_geom_avg' as an option for the mode argument of nn.initializers.variance_scaling, so that the variance denominator is the geometric mean of fan_in and fan_out, sqrt(fan_in * fan_out), rather than their arithmetic mean, (fan_in + fan_out) / 2, as with mode='fan_avg'.
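To make the request concrete, here is a minimal sketch of what such an initializer might compute. It is not the proposed patch: the function name variance_scaling_geom_avg is hypothetical, only a plain normal distribution is drawn (the real variance_scaling also supports truncated_normal and uniform), and the fan computation is an assumption modeled on the usual convention of multiplying the in/out axis sizes by the receptive field size.

```python
import math

import jax
import jax.numpy as jnp


def variance_scaling_geom_avg(scale=1.0, in_axis=-2, out_axis=-1, dtype=jnp.float32):
    """Hypothetical initializer: variance = scale / sqrt(fan_in * fan_out)."""

    def init(key, shape, dtype=dtype):
        # Axes other than in/out (e.g. conv kernel dims) form the receptive field.
        receptive = math.prod(shape) / (shape[in_axis] * shape[out_axis])
        fan_in = shape[in_axis] * receptive
        fan_out = shape[out_axis] * receptive
        # Geometric mean of the fans replaces the arithmetic mean used by 'fan_avg'.
        denominator = math.sqrt(fan_in * fan_out)
        stddev = math.sqrt(scale / denominator)
        return stddev * jax.random.normal(key, shape, dtype)

    return init


key = jax.random.PRNGKey(0)
w = variance_scaling_geom_avg()(key, (1024, 16))  # dense kernel: fan_in=1024, fan_out=16
```

For this shape the target standard deviation is sqrt(1 / 128), since sqrt(1024 * 16) = 128.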

Beyond Folklore: A Scaling Calculus for the Design and Initialization of ReLU Networks:

This scaling calculus results in a number of consequences, among them the fact that the geometric mean of the fan-in and fan-out, rather than the fan-in, fan-out, or arithmetic mean, should be used for the initialization of the variance of weights in a neural network.

Initialization using the geometric-mean of the fan-in and fan-out ensures a constant layer scaling factor throughout the network, aiding optimization.

The use of geometric initialization results in an equally weighted diagonal, in contrast to the other initializations considered.

SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems:

Using geometric average allows us to find a better compromise between forward and backward passes and significantly improve training stability and final accuracy

Our split-aware initialization adopts geometric average instead of arithmetic average to make a better balance between forward and backward
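As a small worked example of how the two averages differ, the sketch below compares the denominators for a strongly rectangular layer. The helper fan_denominators is hypothetical and only illustrates the arithmetic; the key 'fan_geom_avg' is the name proposed in this issue, not an existing mode.

```python
import math


def fan_denominators(fan_in, fan_out):
    """Variance denominators for the existing modes and the proposed one."""
    return {
        "fan_in": fan_in,
        "fan_out": fan_out,
        "fan_avg": (fan_in + fan_out) / 2,            # arithmetic mean
        "fan_geom_avg": math.sqrt(fan_in * fan_out),  # geometric mean (proposed)
    }


d = fan_denominators(1024, 16)
print(d["fan_avg"])       # 520.0
print(d["fan_geom_avg"])  # 128.0
```

The arithmetic mean is dominated by the larger fan, while the geometric mean sits between the two on a log scale. When fan_in equals fan_out, both means coincide.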

I can submit a PR for this.

@carlosgmartin carlosgmartin added the enhancement New feature or request label Dec 20, 2024