
Adding additional learning rate schedulers #296

Merged

wiederm merged 18 commits into main from dev-training-scheduler on Oct 29, 2024
Conversation

wiederm (Member) commented Oct 23, 2024

Pull Request Summary

This PR adds a number of additional learning rate schedulers and the infrastructure needed to pass their control parameters. Additionally, it adds a loss scaling scheduler to dynamically control the weight of each loss component as a function of the current epoch index.

Learning rate scheduling

The default learning rate scheduler uses a step function, reducing the learning rate whenever a monitored property stops improving for a given number of optimization steps. An alternative is the CosineAnnealing learning rate scheduler (and its variants with warmup/restarts), which anneals the learning rate from a starting value to a target value over a specified number of epochs.
Other provided LR schedulers are the OneCycle and Cyclic learning rate schedulers; see the PyTorch documentation for their exact behavior.

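The schedulers mentioned above are the standard classes in torch.optim.lr_scheduler. A minimal sketch of driving one of them once per epoch, assuming a generic model and optimizer rather than this repository's configuration interface:

```python
# Illustrative PyTorch sketch; the model, optimizer, and hyperparameter values
# are placeholders, not the defaults used in this PR.
import torch

model = torch.nn.Linear(8, 1)                      # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Anneal the learning rate from 1e-3 towards eta_min over T_max epochs;
# CosineAnnealingWarmRestarts, OneCycleLR, and CyclicLR are configured analogously.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=100, eta_min=1e-5
)

for epoch in range(100):
    # ... one epoch of training with optimizer.step() calls ...
    scheduler.step()  # advance the schedule once per epoch
```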

Loss component scaling

To prioritize different learning tasks in multi-objective training runs, this PR introduces a linear scaling of each loss component that can be optionally activated using the keywords target_weights and mixing_step for each component name. This scales the component from its initial weight to the target_weight value using mixing_step as the step size (note that the sign of mixing_step has to match the sign of the slope).

In the training run shown below, the force component loss is scaled from an initial weight of 0.8 down to 0.2 using a step size of -0.1, after which training continues with the target weight:
(figure: force loss-component weight scaled from 0.8 to 0.2 over the training run)
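A minimal sketch of the linear component scaling described above; the helper name scaled_weight and its signature are hypothetical illustrations, not the PR's actual implementation:

```python
# Hypothetical helper illustrating the linear loss-weight schedule; not the
# actual code added in this PR.
def scaled_weight(epoch: int, weight: float, target_weight: float, mixing_step: float) -> float:
    """Move the component weight towards target_weight by mixing_step per epoch."""
    # mixing_step carries the sign of the slope (negative when annealing downwards).
    scaled = weight + epoch * mixing_step
    # Once the target is reached, keep training with the target weight.
    if mixing_step < 0:
        return max(scaled, target_weight)
    return min(scaled, target_weight)

# Example matching the run above: weight 0.8, target_weight 0.2, mixing_step -0.1.
weights = [scaled_weight(e, 0.8, 0.2, -0.1) for e in range(10)]
# ≈ [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.2, 0.2, 0.2] (up to floating-point rounding)
```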

Key changes

Notable points that this PR has either accomplished or will accomplish.

Associated Issue(s)

Pull Request Checklist

  • Issue(s) raised/addressed and linked
  • Includes appropriate unit test(s)
  • Appropriate docstring(s) added/updated
  • Appropriate .rst doc file(s) added/updated
  • PR is ready for review

- adding a scheduler for the loss components (will allow us to change the scaling of the components as a function of the epoch number)
@wiederm wiederm self-assigned this Oct 23, 2024
@wiederm wiederm added the enhancement New feature or request label Oct 24, 2024
codecov-commenter commented Oct 24, 2024

Codecov Report

Attention: Patch coverage is 93.64162% with 11 lines in your changes missing coverage. Please review.

Project coverage is 85.35%. Comparing base (45be449) to head (87c4d3c).
Report is 19 commits behind head on main.


@wiederm wiederm merged commit 710d1e4 into main Oct 29, 2024
6 checks passed
@wiederm wiederm deleted the dev-training-scheduler branch October 29, 2024 09:08