Skip to content

Conversation

@wujingyue
Copy link
Collaborator

@wujingyue wujingyue commented Jan 2, 2026

For #5308

  • Make some attributes inputs/outputs.
  • Refine some toString methods.

@wujingyue wujingyue requested a review from mdavis36 January 2, 2026 07:58
@github-actions
Copy link

github-actions bot commented Jan 2, 2026

Review updated until commit 7cad027

Description

  • Converted several host IR nodes to use inputs/outputs instead of attributes for stream handling

  • Updated GetCurrentStream to output a stream parameter instead of creating one internally

  • Refined toString/toInlineString methods for better formatting and consistency

  • Updated test cases to match new constructor signatures

Changes walkthrough

Relevant files
Enhancement
ir.cpp
Implement host IR node refinements and stream handling improvements

csrc/host_ir/ir.cpp

  • Removed toInlineString from LaunchKernel (no longer needed)
  • Updated Stream::toString to print address when index is null for
    HostIrEvaluator
  • Modified SetCurrentStream constructor to use inputs instead of
    attributes
  • Updated GetCurrentStream to take stream as output parameter
  • Implemented toInlineString for Wait and Synchronize nodes
  • Updated Synchronize constructor to use inputs instead of attributes
  • Simplified ShardByStream toString formatting
  • +21/-38 
    ir.h
    Update host IR header definitions for improved stream management

    csrc/host_ir/ir.h

  • Removed toInlineString declarations from LaunchKernel,
    SetCurrentStream, and ShardByStream
  • Updated SetCurrentStream::stream() to return inputs().at(0) instead of
    attributes_.at(0)
  • Modified GetCurrentStream constructor to accept Stream* parameter
  • Updated GetCurrentStream::stream() to return outputs().at(0)
  • Updated Synchronize::stream() to return inputs().at(0)
  • Removed sameAs method declarations from SetCurrentStream and
    ShardByStream
  • +4/-9     
    stream_parallel_type.cpp
    Update stream management pass for new GetCurrentStream interface

    csrc/host_ir/pass/stream_parallel_type.cpp

  • Updated GetCurrentStream creation to pass existing stream instead of
    creating new one
  • Modified stream management logic to use pre-created stream objects
  • +3/-2     
    Tests
    test_host_irs.cpp
    Fix test cases for updated GetCurrentStream interface       

    tests/cpp/test_host_irs.cpp

  • Updated test to create Stream object before GetCurrentStream
    construction
  • Modified test to match new GetCurrentStream constructor signature
  • +2/-2     

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    API Breaking Changes

    The PR makes significant API breaking changes to constructors of SetCurrentStream, GetCurrentStream, and Synchronize classes. These changes modify how stream objects are passed/created, changing from internal attribute creation to explicit input/output parameters. While the test file shows updated usage, there may be other parts of the codebase that need to be updated to work with these new constructor signatures.

    SetCurrentStream::SetCurrentStream(IrBuilderPasskey passkey, Stream* stream)
        : Expr(passkey, {stream}, {}, {}) {
      NVF_ERROR(passkey.ir_container_ != nullptr);
      NVF_ERROR(passkey.ir_container_->isA<HostIrContainer>());
    }
    
    NVFUSER_DEFINE_CLONE_AND_CREATE(SetCurrentStream)
    
    std::string SetCurrentStream::toString(int indent_size) const {
      std::stringstream ss;
      indent(ss, indent_size) << "SetCurrentStream(" << stream()->toString() << ")"
                              << std::endl;
      return ss.str();
    }
    
    GetCurrentStream::GetCurrentStream(IrBuilderPasskey passkey, Stream* stream)
        : Expr(passkey, {}, {stream}, {}) {
      NVF_ERROR(passkey.ir_container_ != nullptr);
      NVF_ERROR(passkey.ir_container_->isA<HostIrContainer>());
    }
    Stream Address Printing

    The change to print stream addresses as identifiers when index is null could potentially cause issues if the address representation is not stable across runs or if it causes readability problems in logs/debug output. The comment explains the rationale but this should be validated with actual usage patterns.

      // HostIrEvaluator looks up streams by address when index is null.
      // Print address as identifier. We used to print `name()` but that's often
      // an integer which would be confusing/ambiguous with the index.
      ss << static_cast<const void*>(this);
    } else {
      ss << index()->toInlineString();
    }
    Wait and Synchronize Formatting

    The changes to Wait and Synchronize toString methods now return inline format with newline appended, which is a different formatting approach than before. This should be verified to maintain consistent output formatting across the codebase.

    std::string Wait::toString(int indent_size) const {
      return toInlineString(indent_size) + "\n";
    }
    
    std::string Wait::toInlineString(int indent_size) const {
      std::stringstream ss;
      indent(ss, indent_size) << "Wait(Communication " << communication()->name()
                              << ")";
      return ss.str();
    }
    
    // TODO: implement
    bool Wait::sameAs(const Statement* other) const {
      return false;
    }
    
    Synchronize::Synchronize(IrBuilderPasskey passkey, Stream* stream)
        : Expr(passkey, {stream}, {}, {}) {
      NVF_ERROR(passkey.ir_container_ != nullptr);
      NVF_ERROR(
          passkey.ir_container_->isA<HostIrContainer>(),
          this,
          "must be registered in a HostIrContainer");
    }
    
    NVFUSER_DEFINE_CLONE_AND_CREATE(Synchronize)
    
    std::string Synchronize::toString(int indent_size) const {
      return toInlineString(indent_size) + "\n";
    }
    
    std::string Synchronize::toInlineString(int indent_size) const {
      std::stringstream ss;
      indent(ss, indent_size) << "Synchronize(" << stream() << ")";
      return ss.str();
    }

    Test failures

    • (Medium, 1) NVFuser validation failure (large output mismatch) in PingPongCircularBuffering tests on dlcluster_h100

      Test Name H100 Source
      PingPongCircularBuffering.StageSlicePositionComputeAt/stage_slice_position_4 Link

    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Jan 2, 2026

    Greptile Summary

    This PR refines several host IR nodes to better integrate stream management with the IR structure. Key changes include:

    • Migrate stream storage in SetCurrentStream and Synchronize from attributes to inputs, and in GetCurrentStream from being internally created as an attribute to being an explicit output parameter
    • Remove redundant toInlineString() methods that only threw errors from LaunchKernel and ShardByStream
    • Improve Stream::toString() to print pointer addresses for unnamed streams (which the evaluator uses for lookup) instead of names that could be ambiguous with stream indices
    • Refactor Wait and Synchronize toString() methods to use their toInlineString() methods internally
    • Update all call sites in stream_parallel_type.cpp and tests to work with the new API

    These changes promote consistency in how the IR nodes represent their I/O semantics while simplifying the API and improving debugging output clarity.

    Confidence Score: 5/5

    • This PR is safe to merge with high confidence - all changes are well-scoped API refinements with consistent updates across all call sites and tests.
    • The PR makes coherent architectural improvements to stream management in host IR nodes: moving stream representation from attributes to inputs/outputs where they semantically belong, removing stub methods that only threw errors, and improving debugging output. All changes are systematic - every modification to a class definition is matched by corresponding updates to constructors, accessors, and all call sites. The test updates verify the new API works correctly. The changes are low-risk refactoring with clear semantic improvements.
    • No files require special attention

    Important Files Changed

    Filename Overview
    csrc/host_ir/ir.h Header changes refactor stream IR nodes: SetCurrentStream and Synchronize now access stream via inputs(), GetCurrentStream takes stream parameter in constructor and accesses via outputs(), removed stub toInlineString() methods from LaunchKernel and ShardByStream, removed sameAs() from SetCurrentStream.
    csrc/host_ir/ir.cpp Implementation updates match header changes: constructors updated to pass streams to inputs/outputs, removed stub toInlineString() methods, Wait and Synchronize refactored to use toInlineString() internally, Stream::toString() now prints pointer address for unnamed streams (better for evaluator lookup), SetCurrentStream sameAs stub removed.
    csrc/host_ir/pass/stream_parallel_type.cpp Updated to match new GetCurrentStream API: creates stream separately and passes it to constructor instead of extracting it from the IR node after construction.
    tests/cpp/test_host_irs.cpp Updated test to match new GetCurrentStream API: creates stream explicitly and passes it to constructor rather than extracting it from the IR node.

    Sequence Diagram

    sequenceDiagram
        participant Code as Code
        participant GetCurrentStream as GetCurrentStream(stream)
        participant HostIrEvaluator as HostIrEvaluator
        participant CUDA as CUDA Runtime
        
        Code->>GetCurrentStream: create with output stream
        GetCurrentStream->>HostIrEvaluator: handle(GetCurrentStream)
        HostIrEvaluator->>CUDA: getCurrentCUDAStream()
        CUDA-->>HostIrEvaluator: cudaStream_t
        HostIrEvaluator->>HostIrEvaluator: streams_[output_stream] = cuda_stream
        
        Code->>Code: SetCurrentStream(stream)
        Code->>HostIrEvaluator: handle(SetCurrentStream)
        HostIrEvaluator->>CUDA: setCurrentCUDAStream(stream)
    
    
    Loading

    @wujingyue wujingyue changed the title Refine host stream get/set IR nodes Refine several host IR nodes Jan 6, 2026
    @wujingyue
    Copy link
    Collaborator Author

    !test

    @wujingyue
    Copy link
    Collaborator Author

    !test

    @wujingyue wujingyue merged commit 8f811e1 into main Jan 6, 2026
    60 of 61 checks passed
    @wujingyue wujingyue deleted the wjy/getset branch January 6, 2026 16:08
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    3 participants