Factor Expr operand replacement in host IR lowering #5539
Conversation
Description
PR Type: Enhancement
PR Reviewer Guide
Here are some key observations to aid the review process:
🧪 No relevant tests

⚡ Recommended focus areas for review

**Performance Impact**

`cloneWithNewOperands` and its usage in `lowerSegment` should be evaluated for performance impact, especially in scenarios where no replacements are made.
```cpp
Expr* cloneWithNewOperands(
    Expr* e,
    const std::unordered_map<Val*, Val*>& replacement_map) {
  auto maybe_replace = [&](Val*& x) -> bool {
```
I also tried to use OptOutMutator (#5538) but ran into:

```
[ RUN ] StreamTest.Matmul
unknown file: Failure
C++ exception with description " INTERNAL ASSERT FAILED at /opt/pytorch/nvfuser/csrc/mutator.cpp:90, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues.
Expected !DependencyCheck::isDependencyOf(val, mutation) . Attempted to replace a val, T1_g_float[istreamIdx7{3}, iS11{i2}, iS8{( ceilDiv(i4, 3) )}], with a dependent val, T3_l_float[istreamIdx14{3}, iS12{i2}, iS15{( ceilDiv(i4, 3) )}] (T3_l_float[istreamIdx14{3}, iS12{i2}, iS15{( ceilDiv(i4, 3) )}]), which is not allowed as it would result in a recursive definition of T3_l_float[istreamIdx14{3}, iS12{i2}, iS15{( ceilDiv(i4, 3) )}]
Exception raised from registerMutation at /opt/pytorch/nvfuser/csrc/mutator.cpp:90 (most recent call first):
```
Let me know if there's an easier way.
Greptile Overview

Greptile Summary

Refactored host IR lowering to use a shared helper function, `cloneWithNewOperands`, for replacing `Expr` operands.

Key changes:

- Added `cloneWithNewOperands`, which swaps an `Expr`'s inputs/outputs according to a `Val` replacement map and returns the original `Expr` when nothing is replaced.
- `lowerSegment` builds the replacement map once and routes segment expressions through the shared helper.
Confidence Score: 5/5
Important Files Changed

File Analysis
Sequence Diagram

```mermaid
sequenceDiagram
    participant LowerSegment as lowerSegment
    participant CloneHelper as cloneWithNewOperands
    participant ReplacementMap as replacement_map
    participant Expr as Expression
    participant NewObjectFunc as newObjectFunc

    LowerSegment->>LowerSegment: Build replacement_map<br/>(sharded TensorViews)
    loop For each Expr in segment
        LowerSegment->>CloneHelper: cloneWithNewOperands(e, replacement_map)
        CloneHelper->>CloneHelper: Copy inputs vector
        CloneHelper->>CloneHelper: Apply maybe_replace to inputs<br/>(count replacements)
        CloneHelper->>CloneHelper: Copy outputs vector
        CloneHelper->>CloneHelper: Apply maybe_replace to outputs<br/>(count replacements)
        alt replaced > 0
            CloneHelper->>NewObjectFunc: Create new Expr with<br/>replaced inputs/outputs
            NewObjectFunc-->>CloneHelper: Return new Expr
            CloneHelper-->>LowerSegment: Return new Expr
        else replaced == 0
            CloneHelper-->>LowerSegment: Return original Expr<br/>(avoid redundant IR node)
        end
        LowerSegment->>LowerSegment: Add Expr to loop body
    end
```
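For concreteness, here is a minimal C++ sketch of the helper as the diagram describes it. This is not the PR's actual diff: the `newObjectFunc()`, `container()`, and `attributes()` calls are assumptions about nvFuser's `Expr` cloning API, and the exact signatures may differ.

```cpp
#include <unordered_map>
#include <vector>

// Sketch only: Expr and Val come from nvFuser's IR headers.
Expr* cloneWithNewOperands(
    Expr* e,
    const std::unordered_map<Val*, Val*>& replacement_map) {
  // Replace x in place when the map has an entry for it; report whether
  // a replacement happened so the caller can count them.
  auto maybe_replace = [&](Val*& x) -> bool {
    auto it = replacement_map.find(x);
    if (it == replacement_map.end()) {
      return false;
    }
    x = it->second;
    return true;
  };

  // Copy the operand vectors and apply the replacement map to each entry.
  std::vector<Val*> inputs = e->inputs();
  std::vector<Val*> outputs = e->outputs();
  int64_t num_replaced = 0;
  for (Val*& in : inputs) {
    num_replaced += maybe_replace(in) ? 1 : 0;
  }
  for (Val*& out : outputs) {
    num_replaced += maybe_replace(out) ? 1 : 0;
  }

  // Nothing changed: return the original Expr instead of creating a
  // redundant IR node.
  if (num_replaced == 0) {
    return e;
  }

  // Assumed cloning hook: a per-type factory that rebuilds the Expr
  // with the replaced operands.
  return e->newObjectFunc()(
      e->container(), inputs, outputs, e->attributes());
}
```

The early return for `num_replaced == 0` is what the reviewer guide's performance note is about: expressions untouched by the map incur only the hash lookups, not a new IR allocation.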
1 file reviewed, no comments
!test
Priya2698 left a comment
Can we add a test here to catch any unintended behavior changes introduced by later modifications?
What do you mean? This change doesn't affect behavior AFAIK. cc @Priya2698
The purpose of the PR is to avoid redundant IR nodes, so the fusion IR should differ before and after this PR. I'm suggesting that this behavior, that is, the updated fusion IR, can be tested. Is this not the case?
Got it -- thanks for clarifying! I updated the PR description. The purpose is to have a common helper that both `SchedulerType::ExprEval` and `SchedulerType::Communication` can call. Not having redundant IR nodes is a side benefit and an implementation detail that I'd rather not test (https://testing.googleblog.com/2013/08/testing-on-toilet-test-behavior-not.html). cc @Priya2698
Add cloneWithNewOperands so #5536 can reuse a single path for replacing Expr inputs/outputs based on a Val replacement map.
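As a usage illustration, a hypothetical call site mirroring the `lowerSegment` flow described above; all names below (`original_tv`, `sharded_tv`, `segment_exprs`, `loop_body`) are stand-ins, not identifiers from this PR:

```cpp
// Map each original TensorView to its per-stream shard, then clone every
// Expr in the segment against that map.
std::unordered_map<Val*, Val*> replacement_map;
replacement_map[original_tv] = sharded_tv; // hypothetical Vals

for (Expr* e : segment_exprs) { // hypothetical segment expression list
  Expr* lowered = cloneWithNewOperands(e, replacement_map);
  // If none of e's operands appear in the map, lowered == e and no
  // redundant IR node is created.
  loop_body.push_back(lowered); // hypothetical host-IR loop body
}
```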