Pass to block dynamic dimensions of operands of iree_linalg_ext.attention. #18874

Open

wants to merge 2 commits into base: main

Commits on Oct 23, 2024

  1. Allow dynamic dimensions during folding of `tensor.expand_shape/collapse_shape` into `flow.dispatch.tensor.load/store`.
    
    This also cleans up the implementation of these patterns, avoiding
    templated code that is hard to read and maintain. (See the first
    sketch after the commit list for a before/after example of the
    folding.)
    
    Signed-off-by: MaheshRavishankar <[email protected]>
    MaheshRavishankar committed Oct 23, 2024
    2d64ab1
  2. Pass to block dynamic dimensions of operands of `iree_linalg_ext.attention`.
    
    `IntegerRangeAnalysis` and `IntegerDivisibilityAnalysis` provide
    range and divisibility information for constants passed to the
    dispatch. This can be used to infer range and divisibility
    information for the dynamic dimensions of all tensor values in the
    dispatch. This PR adds an analysis to do this.
    
    This analysis is then used to expand the dimensions of operands of
    the attention operation that are dynamic but known to be divisible
    by a compile-time static value (see the second sketch after the
    commit list). This puts the operation into a form that the AMDGPU
    backend can compile to target the mfma intrinsics.
    
    Signed-off-by: MaheshRavishankar <[email protected]>
    MaheshRavishankar committed Oct 23, 2024
    949f383
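
Below is a minimal, hypothetical MLIR sketch of the folding enabled by the first commit. The op syntax follows upstream MLIR and IREE, but the value names (`%src`, `%src_expanded`), shapes, and the factor 32 are illustrative assumptions, not taken from the PR.

```mlir
// Before: a dynamically sized flow.dispatch.tensor.load feeding a
// tensor.expand_shape. %sz is the flat dynamic size and %d the outer
// expanded size, with %sz == %d * 32 by construction.
%0 = flow.dispatch.tensor.load %src, offsets = [0], sizes = [%sz], strides = [1]
    : !flow.dispatch.tensor<readonly:tensor<?xf32>>{%sz} -> tensor<?xf32>
%1 = tensor.expand_shape %0 [[0, 1]] output_shape [%d, 32]
    : tensor<?xf32> into tensor<?x32xf32>

// After: the expand_shape is folded into the load, which now reads
// directly at the expanded shape even though %d is dynamic.
%1 = flow.dispatch.tensor.load %src_expanded, offsets = [0, 0], sizes = [%d, 32], strides = [1, 1]
    : !flow.dispatch.tensor<readonly:tensor<?x32xf32>>{%d} -> tensor<?x32xf32>
```

Folding retypes the source binding, shown here as the separate value `%src_expanded`; the dynamic sizes involved are presumably what the pre-existing patterns rejected and this commit now allows.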
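Similarly, a hedged sketch of what the new pass might produce for an attention operand. The factor 16 and the names `%q`, `%m`, `%k0` are illustrative assumptions, and how `iree_linalg_ext.attention` itself consumes the blocked operand is omitted.

```mlir
// %q is, say, the query operand of an iree_linalg_ext.attention op,
// with shape M x K0 where M is dynamic. The analysis has proven that M
// is divisible by 16 (e.g. every constant reaching the dispatch is a
// multiple of 16).
%c16 = arith.constant 16 : index
%m_outer = arith.divui %m, %c16 : index

// Block the dynamic M dimension into (M/16) x 16. The inner dimension
// is now a compile-time constant, giving the AMDGPU backend a static
// tile it can map onto mfma intrinsics.
%q_blocked = tensor.expand_shape %q [[0, 1], [2]] output_shape [%m_outer, 16, %k0]
    : tensor<?x?xf16> into tensor<?x16x?xf16>
```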