Change Log
Feature
- Implement `FOCUS` optimizer. (#330, #331)
- Implement `PSGD Kron` optimizer. (#336, #337)
- Implement `EXAdam` optimizer. (#338, #339)
Update
- Support the `OrthoGrad` variant in `Ranger25`. (#332)
  - `Ranger25` is my experimental optimizer, which mixes several optimizer variants such as `ADOPT` + `AdEMAMix` + `Cautious` + `StableAdamW` + `Adam-Atan2` + `OrthoGrad`.
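For context, the `OrthoGrad` idea can be sketched as projecting each gradient onto the subspace orthogonal to its weight tensor and rescaling to preserve the gradient norm. This is a minimal, hypothetical NumPy illustration of that projection, not the library's actual implementation:

```python
import numpy as np

def orthograd_project(w: np.ndarray, g: np.ndarray, eps: float = 1e-30) -> np.ndarray:
    """Sketch of the OrthoGrad projection: remove the component of the
    gradient g parallel to the weights w, then rescale the result back
    to the original gradient norm. Illustrative only."""
    w_flat, g_flat = w.ravel(), g.ravel()
    # Subtract the projection of g onto w.
    coeff = np.dot(w_flat, g_flat) / (np.dot(w_flat, w_flat) + eps)
    g_orth = g_flat - coeff * w_flat
    # Rescale so the update magnitude matches the raw gradient's norm.
    g_orth *= np.linalg.norm(g_flat) / (np.linalg.norm(g_orth) + eps)
    return g_orth.reshape(g.shape)

w = np.array([1.0, 0.0])
g = np.array([3.0, 4.0])
print(orthograd_project(w, g))  # → [0. 5.]: orthogonal to w, same norm as g
```

The projected gradient has zero dot product with the weights, which keeps updates from growing the weight norm directly.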
Fix
- Add the missing `state` property in the `OrthoGrad` optimizer. (#326, #327)
- Add the missing `state_dict` and `load_state_dict` methods to the `TRAC` and `OrthoGrad` optimizers. (#332)
- Skip sparse gradients in the `OrthoGrad` optimizer. (#332)
- Support alternative-precision training in the `SOAP` optimizer. (#333)
- Store the `SOAP` condition matrices in the dtype of their parameters. (#335)
Contributions
Thanks to @Vectorrent and @kylevedder.