Mooncake reverse rules #85

kshyatt · 2025-10-22T18:05:03Z

No description provided.

github-actions · 2025-10-22T19:08:43Z

Your PR no longer requires formatting changes. Thank you for your contribution!

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl

lkdvos

I left some comments throughout, I haven't looked at the tests yet though.

One thing that I am still struggling with is understanding why the adjoint functions return NoRData() for all of the input arguments. Am I correct in assuming that this only works for mutable data? Should we maybe reconsider the strictness of our rules in that case?

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl

src/implementations/eigh.jl

src/pullbacks/eig.jl

test/ad_utils.jl

test/mooncake.jl

kshyatt · 2025-11-04T18:56:47Z

One thing that I am still struggling with is understanding why the adjoint functions return NoRData() for all of the input arguments. Am I correct in assuming that this only works for mutable data? Should we maybe reconsider the strictness of our rules in that case?

Yes, afaik it's set up this way because all the reverse-data you need should be "packaged in" to the tangents of the CoDual for arrays or tuples of arrays -- you'd return explicit RData if you have scalar arguments

kshyatt · 2025-11-05T10:35:18Z

OK, everything has been addressed except for the kwargs discussion above, I think

lkdvos

Thanks for so quickly going into the comments and resolving them!

I tried going over this again, and I have another round of questions (apologies for that, I think pieces are falling together as I start understanding what is going on):

In the rrule!s, I see we are consistently copying the state of the provided memory for the output of the primal call. (arg1c = copy(arg1) etc). I think I understand that this is what is required by Mooncake, as we need to restore the state before the function call to ensure rules that need to happen earlier in the function call get the correct data.

If we have to make a copy of the current state anyways, and we aren't required by our interface to ensure that our factorizations are in-place, would it maybe be simpler to call the non-inplace version of the factorization function to begin with, and not mutate?

For example, something like this:

function Mooncake.rrule!!(::CoDual{typeof(eig_full!)}, A_dA::CoDual, DV_dDV::CoDual, alg_dalg::CoDual)
    # argument sanitizing with arrayify etc
    DV = eig_full(A, alg)
    function adjoint(::Mooncake.NoRData)
        eig_pullback!(dA, A, DV..., dD, dV; kwargs...)
        MatrixAlgebraKit.zero!.(dDV)
        return NoRData(), ...
    end
    ...
end

where we effectively say that eig_full!(A, DV, alg) = eig_full(A, alg) for AD purposes.

Along the same lines of reasoning, a different implementation style + optimization might be to actually implement rrules for the non-mutating functions first, i.e. eig_full(A, alg), and then add a rrule for eig_full!(A, DV, alg) that simply calls the eig_full rrule.
Here, the nice part is that we are not keeping copies of state that we have just allocated, and know we wont be needing later on, and additionally that we might be simplifying some of the logic.

I still have to actually look at the test functionality to see what is going on there, but one thing at a time 🙃

lkdvos · 2025-11-05T20:47:46Z

test/ad_utils.jl

Given that I have to copy this implementation in TensorKit, and failed horrendously:
https://github.com/QuantumKitHub/TensorKit.jl/blob/bbeb8e510727fd2ce77c579ae9833f5056c85cc8/test/autodiff/ad.jl#L105-L120
I think it would be nice to have this as part of the main package, and if not, to turn this into a "test module" that can be included from within downstream packages (probably mostly by adding some comments to this file to indicate that its location and contents are considered "public" and cannot be moved/removed without breaking changes)

kshyatt · 2025-11-05T22:14:26Z

In the rrule!s, I see we are consistently copying the state of the provided memory for the output of the primal call. (arg1c = copy(arg1) etc). I think I understand that this is what is required by Mooncake, as we need to restore the state before the function call to ensure rules that need to happen earlier in the function call get the correct data.

Yes, exactly

If we have to make a copy of the current state anyways, and we aren't required by our interface to ensure that our factorizations are in-place, would it maybe be simpler to call the non-inplace version of the factorization function to begin with, and not mutate?

Yes, we can do that also. However generally the preference is to implement the rule for the "lowest" method which is why I chose the inplace ones. I can do the ones that copy instead and see if we like that better?

kshyatt · 2025-11-05T22:24:08Z

FWIW for Enzyme, there is no requirement to restore the state of he variables on the reverse pass which is the other reason I did things this way -- to be able to compare the two "like to like".

kshyatt · 2025-11-10T13:14:08Z

BTW for the in place methods, I think there is no way to avoid copying A as Mooncake (understandably) wants the rule to output the correct "foward" result for A before the pullback is called

lkdvos

Can we for now comment out the 1.12 tests for mooncake so we have an idea of the tests?

src/pullbacks/eig.jl

lkdvos

Okay, let me summarize what I think for this PR specifically, also keep in mind that this is my opinion and you are allowed to just overrule that:

kwargs... in the rrule!!
I don't actually have a great solution for the fact that we don't really support keyword arguments on top of the factorizations.
I don't think the current way of just adding these as kwargs to the rrule!! works (although it doesn't hurt either), since the primal call can't actually handle any keyword arguments (e.g.

MatrixAlgebraKit.jl/src/implementations/eig.jl

Line 81 in c99084b

function eig_full!(A::AbstractMatrix, DV, alg::LAPACK_EigAlgorithm)

).
The only precedences for this we currently have are either KrylovKit, where we ended up adding a alg_rrule keyword argument to all of the primal computations that is simply ignored, which could be an empty NamedTuple in our case.
The other is PEPSKit, where I designed a hook_pullback function that collaborates with the ChainRules.jl system to insert keyword arguments into a function that are passed on to the pullback: https://github.com/QuantumKitHub/PEPSKit.jl/blob/master/src/utility/hook_pullback.jl
Importantly, we don't need to solve that for this PR, since we are more or less everywhere just using the defaults anyways, so we can tackle that later. I would therefore maybe suggest to remove the keywords for now, since adding that would be non-breaking so that's easier to do in a follow up. (I'm fine leaving them as is too, it shouldn't matter since these shouldnt be able to ever be generated anyways)
Did we decide what to do with the @is_primitive directions?
I remember talking about them being more or less specific, but can't remember what we decided on. What is worrying me slightly is that we are clearly assuming everywhere that the inputs and outputs are mutable, which I guess doesn't always have to be the case. In the interest of moving things forwards, I guess the easiest solution here is to just restrict these definitions more aggressively, claiming that the AD engine should then figure out how to handle other cases and start adding cases back in, rather than the other direction?
There's a little bit of leftover commented out code in the tests, which could do with either removing or perhaps some explanatory comments about why they are there/why they are commented out
Overall it might be slightly nicer to have some additional comments to explain what is going on, as it really takes a bit of time to understand all of this. I'm not saying this has to be done neither by you nor now, just a general remark that this could actually benefit the overall progress here a bit.

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl

codecov · 2025-11-11T18:51:01Z

Codecov Report

❌ Patch coverage is 99.02439% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...gebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl	98.96%	2 Missing ⚠️

Files with missing lines	Coverage Δ
src/pullbacks/polar.jl	`100.00% <100.00%> (ø)`
src/pullbacks/svd.jl	`96.26% <100.00%> (ø)`
...gebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl	`98.96% <98.96%> (ø)`

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl

test/runtests.jl

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl

test/ad_utils.jl

test/mooncake.jl

Jutho · 2025-11-13T18:04:40Z

Ok, I think this is the last set of comments and questions from my side.

test/mooncake.jl

Jutho · 2025-11-14T09:37:48Z

Ok, my approval still stands, so if @lkdvos agrees this can be merged.

lkdvos

Looks great, thanks a lot for the hard work (and the hard patience!).

I only have some cosmetic comments left, you can choose to ignore this if you like but I think it keeps everything somewhat in the same style.
These should otherwise not make any difference though:

I think you imported NoRData but then qualify Mooncake.NoRData everywhere. Since we are already using CoDual everywhere and this is a Mooncake extension, it might be fine to simply use NoRData() everywhere to lighten the notation a bit.
In the @eval for loops there is the combination of Symbols and Function objects. I have absolutely no clue about any performance differences, or burdens on the compiler, but I think in most other parts of the ecosystem we consistently use Symbols, and I think here it wouldn't make a difference either? Correct me if I'm wrong though, no need to spend a whole lot of effort here.
For the adjoint function names, I don't think the extra d* at the front is totally necessary, qr_adjoint looks maybe a little cleaner? I checked the Mooncake rrules to see if they have any consistent conventions, and from what I can tell here there seems to be a combination of f_adjoint and f_pb!!. I don't have any strong preference, so maybe f_adjoint?

Otherwise definitely a great PR that we will benefit from greatly :)

kshyatt · 2025-11-14T13:45:02Z

Thanks! There are a few other small cosmetic improvements I'll make with imports as well.

* Bump v0.6 * rename `gaugefix` -> `fixgauge` * reduce unnecessary warnings * fix `copy_input` signatures in Mooncake tests * Add changelog to docs

kshyatt requested review from Jutho and lkdvos October 22, 2025 18:05

kshyatt force-pushed the ksh/mooncake branch 2 times, most recently from 3af6855 to c665fc4 Compare October 24, 2025 12:26

kshyatt commented Oct 27, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

kshyatt commented Oct 27, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

kshyatt force-pushed the ksh/mooncake branch 4 times, most recently from 743950b to f82d3b7 Compare November 4, 2025 11:07

lkdvos reviewed Nov 4, 2025

View reviewed changes

lkdvos reviewed Nov 5, 2025

View reviewed changes

kshyatt force-pushed the ksh/mooncake branch from 52c9341 to 2234edc Compare November 7, 2025 16:05

lkdvos reviewed Nov 10, 2025

View reviewed changes

src/pullbacks/eig.jl Outdated Show resolved Hide resolved

lkdvos reviewed Nov 10, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

kshyatt force-pushed the ksh/mooncake branch from fcb67e7 to e90e842 Compare November 11, 2025 19:12

lkdvos reviewed Nov 11, 2025

View reviewed changes

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Show resolved Hide resolved

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved

Jutho reviewed Nov 12, 2025

View reviewed changes

ext/MatrixAlgebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl Outdated Show resolved Hide resolved