Horizontal fusion without using concats #19877

Open
5 tasks
MaheshRavishankar opened this issue Feb 2, 2025 · 1 comment

MaheshRavishankar commented Feb 2, 2025

The current implementation of horizontal fusion of contractions concatenates the original operands to create a single contraction op. This adds a lot of transient memory footprint (and high initialization copy costs). It also prevents horizontal fusion from kicking in in some cases. This is a tracking issue for revamping horizontal fusion to address these problems.
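
To make the trade-off concrete, here is a minimal sketch in plain Python (lists of lists, no IREE APIs). It only illustrates the idea and is not the compiler's implementation: all function names are hypothetical, and it assumes two matmuls that share the same left-hand operand and have right-hand operands of identical shape.

```python
def matmul(a, b):
    """Reference (M x K) @ (K x N) on lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def fused_with_concat(a, b1, b2):
    """Concat-style fusion: build [b1 | b2], run one matmul, slice the result."""
    n1 = len(b1[0])
    # Transient copy of both right-hand operands (the overhead described above).
    b_cat = [row1 + row2 for row1, row2 in zip(b1, b2)]
    c_cat = matmul(a, b_cat)
    return [row[:n1] for row in c_cat], [row[n1:] for row in c_cat]


def fused_without_concat(a, b1, b2):
    """One loop nest producing two results; no concatenated operand is built."""
    m, k, n = len(a), len(b1), len(b1[0])  # assumes b1 and b2 have the same shape
    c1 = [[0] * n for _ in range(m)]
    c2 = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                x = a[i][p]  # shared operand element, loaded once for both results
                c1[i][j] += x * b1[p][j]
                c2[i][j] += x * b2[p][j]
    return c1, c2
```

Both variants compute the same pair of results; the second one avoids materializing `b_cat` and the slices taken from `c_cat`.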

The dev branch for these changes is https://github.com/MaheshRavishankar/iree/tree/noconcat_horizontal_fusion_e2e. This issue also tracks the changes required to make them work in general.

For now, these are the examples being used for development:

test1.mlir
test2.mlir
test3.mlir
test4.mlir

These can be compiled with a build of the https://github.com/MaheshRavishankar/iree/tree/noconcat_horizontal_fusion_e2e branch:

iree-compile <input file> --iree-hal-target-device=hip --iree-hip-target=gfx942 --iree-dispatch-creation-enable-aggressive-fusion=true --iree-dispatch-creation-enable-fuse-horizontal-contractions=true -o <output vmfb>
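
For running all four examples in one go, a small helper along these lines may be convenient. It is only a sketch: the flags are copied from the command above, while the helper names and the `.vmfb` output naming are assumptions made for illustration. It prints the commands rather than running them.

```python
import shlex

# Flags taken verbatim from the iree-compile invocation above.
FLAGS = [
    "--iree-hal-target-device=hip",
    "--iree-hip-target=gfx942",
    "--iree-dispatch-creation-enable-aggressive-fusion=true",
    "--iree-dispatch-creation-enable-fuse-horizontal-contractions=true",
]


def compile_command(input_file, output_vmfb):
    """Build the argv list for one input (hypothetical helper)."""
    return ["iree-compile", input_file, *FLAGS, "-o", output_vmfb]


def commands_for(inputs):
    """Assumed naming scheme: foo.mlir is compiled to foo.vmfb."""
    return [compile_command(f, f.rsplit(".", 1)[0] + ".vmfb") for f in inputs]


if __name__ == "__main__":
    tests = ["test1.mlir", "test2.mlir", "test3.mlir", "test4.mlir"]
    for cmd in commands_for(tests):
        print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```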

Here are some tasks that need to be addressed (and tentative assignees)

MaheshRavishankar self-assigned this Feb 2, 2025

MaheshRavishankar commented Feb 5, 2025

I needed to go a bit further down this path to see any actual benefit, so I made some changes to handle inputs of this form:

test5.mlir

I got it through to the vector distribute pass, but that pass fails on this input.
@Groverkss, could you take a look?

iree-opt --pass-pipeline="builtin.module(func.func(iree-llvmgpu-vector-distribute))" vector_distribute_repro.mlir

vector_distribute_repro.mlir:1:1: error: 'func.func' op failed to distribute

func.func @horizontal_fusion_transpose_v_dispatch_0_generic_2x10x4096x64x640_i8xi8xi8xi8xi32xi32xi32() attributes{
    translation_info = #iree_codegen.translation_info<pipeline = LLVMGPUVectorDistribute workgroup_size = [256, 1, 1] subgroup_size = 64, {}>} {              ^                       
