parameter update #274

Open
nickhalmagyi opened this issue Oct 5, 2024 · 0 comments


nickhalmagyi commented Oct 5, 2024

I have a question about how the parameter updates take place. As described in

https://kfac-jax.readthedocs.io/en/latest/overview.html#optimizer

and

https://kfac-jax.readthedocs.io/en/latest/overview.html#automatic-selection-of-update-coefficients

with reference to the KFAC paper, the parameters $\alpha$ and $\beta$ are computed from a local quadratic model.

If I write the damped curvature matrix as $\hat{C} = C + (\lambda + \eta)I$, then I find that with

$$\begin{aligned} g&= \nabla L \\ \delta &= \alpha \hat{C}^{-1} g + \beta v \end{aligned}$$

the partial derivatives of the quadratic model are

$$\begin{aligned} \partial_\alpha q(\delta) &= (1+\alpha) g^T \hat{C}^{-1} g +\beta g^T v \\ \partial_\beta q(\delta) &= (1+\alpha) g^T v + \beta v^T \hat{C}v \end{aligned}$$

which are set to zero by

$$(\alpha, \beta)=(-1,0)$$
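
For concreteness, these derivatives follow if the quadratic model is assumed to have the standard form below. The issue does not write $q$ out explicitly, so this form, with the same damped matrix $\hat{C}$ (taken to be symmetric) as the curvature term, is an assumption:

$$\begin{aligned} q(\delta) &= L + g^T \delta + \tfrac{1}{2} \delta^T \hat{C} \delta \\ \partial_\alpha q(\delta) &= (g + \hat{C}\delta)^T \hat{C}^{-1} g \\ \partial_\beta q(\delta) &= (g + \hat{C}\delta)^T v \end{aligned}$$

Substituting $\hat{C}\delta = \alpha g + \beta \hat{C} v$ gives the two expressions above, and at $(\alpha, \beta) = (-1, 0)$ the common factor $g + \hat{C}\delta$ vanishes.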

Unless I am mistaken, this is similar to Newton's method being exact in one step on a quadratic model: $\delta = -\hat{C}^{-1} g$ is already the minimizer of $q$, so there is no need for momentum and it cannot improve on the Newton step for the quadratic model.

I see the comment that $C$ may be the exact or approximate Fisher matrix, which would alter the calculation. But is it correct that, in principle, when the same $C$ is used for both $\delta$ and $q$, the quadratic model is solved trivially as above?
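
The question can be checked numerically with a small sketch. This is not `kfac_jax` code: the helper `optimal_coefficients`, the random test matrices, and the diagonal preconditioner are all hypothetical stand-ins chosen for illustration. It solves the 2x2 linear system for $(\alpha, \beta)$ once with the exact inverse of $\hat{C}$ as the preconditioner and once with an approximate inverse, assuming the quadratic model $q(\delta) = g^T\delta + \tfrac{1}{2}\delta^T\hat{C}\delta$.

```python
import numpy as np


def optimal_coefficients(C_hat, precond_inv, g, v):
    """Minimise q(delta) = g @ delta + 0.5 * delta @ C_hat @ delta over
    (alpha, beta), where delta = alpha * (precond_inv @ g) + beta * v.

    Hypothetical helper for illustration only; not part of kfac_jax.
    """
    d = precond_inv @ g  # preconditioned gradient direction
    # Stationarity conditions of q restricted to span{d, v}: A @ [alpha, beta] = b
    A = np.array([[d @ C_hat @ d, d @ C_hat @ v],
                  [d @ C_hat @ v, v @ C_hat @ v]])
    b = -np.array([g @ d, g @ v])
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
# Symmetric positive definite stand-in for C + (lambda + eta) * I (assumed values)
C_hat = M @ M.T + 0.1 * np.eye(n)
g = rng.standard_normal(n)  # stand-in for the gradient
v = rng.standard_normal(n)  # stand-in for the previous update (momentum direction)

# Same matrix used for the preconditioner and for the quadratic model
exact = optimal_coefficients(C_hat, np.linalg.inv(C_hat), g, v)

# Approximate preconditioner (diagonal only), quadratic model unchanged
approx_inv = np.diag(1.0 / np.diag(C_hat))
approx = optimal_coefficients(C_hat, approx_inv, g, v)

print("exact preconditioner:      ", exact)
print("approximate preconditioner:", approx)
```

With the exact inverse the solution comes out as $(-1, 0)$ up to floating-point error, matching the calculation above. With the approximate inverse the coefficients generally differ and $\beta$ is no longer forced to zero, which is the case where the momentum term can contribute.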
