
[Dispatch] Make dynamic attention ineligible for collapse #19929

Closed

Conversation

IanWood1 (Contributor) commented on Feb 6, 2025:

For the same reasons we can't collapse dynamic linalg.generic ops, we also cannot collapse attention ops with dynamic dimensions. This moves the check for dynamic dims before the check for LinalgExt::AttentionOp. This was seen causing collapse/expand ops in dispatches in #19921.
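As a rough, self-contained sketch of the new ordering (toy types and helper logic, not IREE's actual pass code):

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for an op being considered by the collapse logic; not IREE's
// real data structures.
struct OpInfo {
  bool isAttention;                // stands in for isa<LinalgExt::AttentionOp>
  std::vector<int64_t> loopRanges; // -1 marks a dynamic dimension
};

// Returns true if the op may have its dimensions collapsed.
bool isEligibleForCollapse(const OpInfo &op) {
  // The dynamic-dimension check runs first, so an attention op with dynamic
  // dims is rejected for the same reason a dynamic linalg.generic is.
  for (int64_t range : op.loopRanges)
    if (range < 0)
      return false;
  // Only after the dynamic check does the attention special case apply.
  if (op.isAttention)
    return true;
  // ...remaining eligibility checks for linalg.generic ops would go here...
  return true;
}
```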

MaheshRavishankar (Contributor) commented:

> For the same reasons we can't collapse dynamic linalg.generic ops, we also cannot collapse attention ops with dynamic dimensions. This moves the check for dynamic dims before the check for LinalgExt::AttentionOp. This was seen causing collapse/expand ops in dispatches in #19921.

Why can't we collapse dynamic dimensions in the attention op? (I can't immediately remember why we can't collapse dynamic dimensions of linalg.generic. The only reason I can think of was that tensor.expand_shape didn't support dynamic dimensions, but now it does.)

IanWood1 (Contributor, Author) commented on Feb 7, 2025:

> Why can't we collapse dynamic dimensions in the attention op? (I can't immediately remember why we can't collapse dynamic dimensions of linalg.generic. The only reason I can think of was that tensor.expand_shape didn't support dynamic dimensions, but now it does.)

There are some changes to the pass needed (see #19654). But in this case that might be overkill, since the actual collapsed tensor is fully static even though the attention op has dynamic dims. The attention pattern just bails when any of the loops are dynamic.
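Roughly, a finer-grained check would only look at the dimensions actually being folded together rather than every loop of the op; a toy, self-contained sketch (not the pass's real code):

```cpp
#include <cstdint>
#include <vector>

// -1 marks a dynamic extent, matching the toy convention in the sketch above.
bool groupIsCollapsible(const std::vector<int64_t> &groupDims) {
  for (int64_t d : groupDims)
    if (d < 0)
      return false; // any dynamic dim inside the group blocks collapsing it
  return true;
}

int main() {
  // An attention-like case: the batch dim is dynamic (-1), but the dims being
  // folded together are static, so that group by itself is collapsible.
  std::vector<int64_t> collapsedGroup = {64, 128};
  std::vector<int64_t> batchGroup = {-1};
  return (groupIsCollapsible(collapsedGroup) && !groupIsCollapsible(batchGroup)) ? 0 : 1;
}
```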

MaheshRavishankar (Contributor) commented:

How does this work for LLaMa then? I thought that we previously had issues with the attention op not being collapsed back down. If this doesn't cause a regression on LLaMa (which isn't in CI), this looks fine to me.

IanWood1 (Contributor, Author) commented:

Not needed after #19654

IanWood1 closed this on Feb 14, 2025.