[DRAFT] Add basic test for GQA (fusion) #2142
❌ 7 Tests Failed:
Before discussing where to put this, let me explain why I created it. I have a GQA fusion implementation that successfully replaces a subgraph with GQA, but the model produces different outputs with and without the fusion. It is a very complicated pattern, and it was not clear what exactly was wrong. An advantage of a test case like this one is that I can step through both the "expanded subgraph" and the "replacement subgraph" in a debugger, stopping at intermediate points to verify that values look the way I think they should (for example: does the attention mask look like a causal mask?). All I have to do is call the script function with the inputs, instead of using its model proto for an ORT session. The proof of correctness of a fusion rule comes from a proof of equivalence of the code fragments, as captured in this test case.
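To illustrate the kind of equivalence check described above, here is a minimal sketch in plain NumPy. It is not the test from this PR and does not use ONNX Script or ORT; the function names (`expanded_attention`, `grouped_attention`, `causal_mask`), the tensor layout, and the shapes are all assumptions chosen for illustration. The "expanded" version repeats the key/value heads explicitly, the "grouped" version stands in for a fused operator, and both can be called eagerly so that intermediates such as the mask can be inspected in a debugger.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_mask(seq_len):
    # Additive mask: 0 on and below the diagonal, -inf above it.
    return np.where(np.tri(seq_len, dtype=bool), 0.0, -np.inf)


def expanded_attention(q, k, v, num_heads, kv_num_heads):
    # Hypothetical "expanded subgraph": repeat each KV head so that K and V
    # have as many heads as Q, then run ordinary causal attention.
    # Assumed layout: q is (B, num_heads, S, D); k, v are (B, kv_num_heads, S, D).
    group = num_heads // kv_num_heads
    k_rep = np.repeat(k, group, axis=1)
    v_rep = np.repeat(v, group, axis=1)
    scores = q @ k_rep.transpose(0, 1, 3, 2) / np.sqrt(q.shape[-1])
    mask = causal_mask(q.shape[2])
    probs = softmax(scores + mask)
    return probs @ v_rep, mask  # the mask is returned so it can be inspected


def grouped_attention(q, k, v, num_heads, kv_num_heads):
    # Hypothetical stand-in for the fused operator: group the query heads
    # instead of copying K and V.
    batch, _, seq_len, dim = q.shape
    group = num_heads // kv_num_heads
    qg = q.reshape(batch, kv_num_heads, group, seq_len, dim)
    scores = np.einsum("bhgsd,bhtd->bhgst", qg, k) / np.sqrt(dim)
    probs = softmax(scores + causal_mask(seq_len))
    out = np.einsum("bhgst,bhtd->bhgsd", probs, v)
    return out.reshape(batch, num_heads, seq_len, dim)


rng = np.random.default_rng(0)
B, H, KV, S, D = 2, 4, 2, 5, 8  # illustrative sizes only
q = rng.standard_normal((B, H, S, D))
k = rng.standard_normal((B, KV, S, D))
v = rng.standard_normal((B, KV, S, D))

expected, mask = expanded_attention(q, k, v, H, KV)
actual = grouped_attention(q, k, v, H, KV)

# Intermediate check: does the mask look like a causal mask?
looks_causal = bool(np.all(np.isneginf(mask[np.triu_indices(S, k=1)])))
# Final check: do the two formulations agree?
outputs_match = bool(np.allclose(expected, actual))
```

Because both functions run eagerly, a breakpoint inside either one exposes `scores`, `mask`, and `probs` directly, which is the debugging benefit the comment above describes.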
To continue the discussion: once a working fusion rule is in place, this test can easily be turned into a test of the fusion rule (by applying the fusion to the expanded model defined in the test case).
Closing this. Will create an updated PR with GQA fusion added.
Adds a test to verify the equivalence of an expanded graph and the GQA operator (which will serve as the basis for a fusion rule).