MarkAliasesPrepare to recognize meta ops with DID loop split. #3902

Open
wujingyue opened this issue Feb 15, 2025 · 2 comments
Labels
allocation domain (issues related to allocation domain support), Multi-GPU

Comments

@wujingyue
Collaborator

wujingyue commented Feb 15, 2025

See https://github.com/NVIDIA/Fuser/tree/bug3902 for a repro.

The root cause is that too much code assumes the allocation domain is a permutation of the logical domain, e.g.,

if (!ir_utils::computePermutation(

cc @jjsjann123: have you thought about this? Wonder whether/how ID model can play a role here.
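To illustrate the permutation assumption mentioned above, here is a minimal, self-contained sketch. It does not use nvFuser code: `ToyId`, `toyComputePermutation`, and the ID names (`"DIDx"`, `"i0/d"`) are hypothetical stand-ins chosen for illustration. It assumes a DID loop split replaces one logical ID with two allocation IDs, so no permutation of the logical domain can describe the allocation domain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an IterDomain: just a name, not the nvFuser class.
using ToyId = std::string;

// Returns perm such that out[i] == in[perm[i]] for every i, or std::nullopt
// when `out` is not a reordering of `in`.
std::optional<std::vector<std::int64_t>> toyComputePermutation(
    const std::vector<ToyId>& in,
    const std::vector<ToyId>& out) {
  if (in.size() != out.size()) {
    return std::nullopt;
  }
  std::vector<std::int64_t> perm;
  std::vector<bool> used(in.size(), false);
  for (const ToyId& id : out) {
    bool found = false;
    for (std::size_t j = 0; j < in.size(); ++j) {
      if (!used[j] && in[j] == id) {
        used[j] = true;
        perm.push_back(static_cast<std::int64_t>(j));
        found = true;
        break;
      }
    }
    if (!found) {
      return std::nullopt;
    }
  }
  return perm;
}

// Allocation is a plain reordering of logical: a permutation exists.
bool permutedAllocationIsRecognized() {
  const std::vector<ToyId> logical = {"i0", "i1"};
  const std::vector<ToyId> allocation = {"i1", "i0"};
  return toyComputePermutation(logical, allocation).has_value();
}

// Assumed shape after a DID loop split: i0 becomes {DIDx, i0/d}.
// The allocation domain has new IDs, so no permutation exists.
bool didSplitAllocationIsRecognized() {
  const std::vector<ToyId> logical = {"i0", "i1"};
  const std::vector<ToyId> allocation = {"DIDx", "i0/d", "i1"};
  return toyComputePermutation(logical, allocation).has_value();
}

// Small self-check tying the two cases together.
void checkPermutationExamples() {
  assert(permutedAllocationIsRecognized());
  assert(!didSplitAllocationIsRecognized());
}
```

In this toy model, any analysis that bails out when the permutation lookup fails would reject the split case even though the layout itself is well defined.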

@wujingyue wujingyue added allocation domain issues related to allocation domain support Multi-GPU labels Feb 15, 2025
@jjsjann123
Collaborator

Looking very roughly at the analysis:

In function AliasFinder::mapInLayoutToOutRoot

  std::unordered_map<IterDomain*, IterDomain*> in_logical_to_out_root =
      PairwiseLogicalDomainMap(in, out).mapProducerToConsumer();

Instead of using PairwiseLogicalDomainMap, if we switch to id_model, we can try to map it with ExactGraph.

So inside Layout, instead of storing std::vector<IterDomain*>, I think we can do std::vector<ValGroup>?

So we still won't be able to support transforms, but for the repro case, where the meta operation just preserves the allocation domain from producer, we should be able to identify them as the same.
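The idea of comparing layouts by exact group rather than by individual ID can be sketched as follows. This is a toy model under stated assumptions, not nvFuser's IdModel: `ToyExactGraph` is a hypothetical union-find over ID names standing in for an exact graph, a group is represented by its root name, and all ID names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an exact graph: union-find over ID names.
class ToyExactGraph {
 public:
  // Records that IDs `a` and `b` are exactly mapped.
  void mapExact(const std::string& a, const std::string& b) {
    const std::string root_a = find(a);
    const std::string root_b = find(b);
    if (root_a != root_b) {
      parent_[root_a] = root_b;
    }
  }

  // Returns the representative name of the group containing `id`.
  std::string find(const std::string& id) {
    auto it = parent_.find(id);
    if (it == parent_.end()) {
      parent_[id] = id;
      return id;
    }
    // Copy before recursing: recursion may insert and rehash the map.
    const std::string next = it->second;
    if (next == id) {
      return id;
    }
    const std::string root = find(next);
    parent_[id] = root;  // path compression
    return root;
  }

 private:
  std::unordered_map<std::string, std::string> parent_;
};

// Compares two allocation orders ID by ID (the identity-based view).
bool sameLayoutByIdentity(
    const std::vector<std::string>& in,
    const std::vector<std::string>& out) {
  return in == out;
}

// Compares two allocation orders group by group (the group-based view).
bool sameLayoutByGroup(
    ToyExactGraph& graph,
    const std::vector<std::string>& in,
    const std::vector<std::string>& out) {
  if (in.size() != out.size()) {
    return false;
  }
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (graph.find(in[i]) != graph.find(out[i])) {
      return false;
    }
  }
  return true;
}

// Assumed scenario: a meta op whose output keeps the producer's allocation
// domain, with each output ID exactly mapped to the matching input ID.
bool metaOpPreservesLayoutByGroup() {
  const std::vector<std::string> in = {"in.DIDx", "in.i0/d", "in.i1"};
  const std::vector<std::string> out = {"out.DIDx", "out.i0/d", "out.i1"};
  ToyExactGraph graph;
  for (std::size_t i = 0; i < in.size(); ++i) {
    graph.mapExact(in[i], out[i]);
  }
  return !sameLayoutByIdentity(in, out) && sameLayoutByGroup(graph, in, out);
}
```

In this sketch the identity comparison fails because producer and consumer own distinct IDs, while the group comparison succeeds without needing a permutation of the logical domain. Transforms that change the IDs themselves are still out of scope, as noted above.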

@jjsjann123
Collaborator

jjsjann123 commented Feb 18, 2025

> So we still won't be able to support transforms

Even for transforms, if id_model can figure out the equivalence among IterDomains through transforms, we would benefit from that as well, I guess?

For TVs without an allocation domain, we'll need to figure out if/how to replay the transformation. I don't know if we already have a utility that does that, but it seems like something worth exploring.
