Refine several host IR nodes #5746
Review updated until commit 7cad027

Description

| Relevant files |
|---|
| Enhancement |
| Tests |
PR Reviewer Guide
Here are some key observations to aid the review process:
- 🧪 PR contains tests
- ⚡ Recommended focus areas for review: API Breaking Changes

Test failures

- (Medium, 1) NVFuser validation failure (large output mismatch) in PingPongCircularBuffering tests on dlcluster_h100

| Test Name | H100 | Source |
|---|---|---|
| PingPongCircularBuffering.StageSlicePositionComputeAt/stage_slice_position_4 | ❌ | Link |
Greptile Summary

This PR refines several host IR nodes to better integrate stream management with the IR structure. Key changes include:

These changes promote consistency in how the IR nodes represent their I/O semantics while simplifying the API and improving debugging output clarity.

Confidence Score: 5/5
Important Files Changed
Sequence Diagram

sequenceDiagram
participant Code as Code
participant GetCurrentStream as GetCurrentStream(stream)
participant HostIrEvaluator as HostIrEvaluator
participant CUDA as CUDA Runtime
Code->>GetCurrentStream: create with output stream
GetCurrentStream->>HostIrEvaluator: handle(GetCurrentStream)
HostIrEvaluator->>CUDA: getCurrentCUDAStream()
CUDA-->>HostIrEvaluator: cudaStream_t
HostIrEvaluator->>HostIrEvaluator: streams_[output_stream] = cuda_stream
Code->>Code: SetCurrentStream(stream)
Code->>HostIrEvaluator: handle(SetCurrentStream)
HostIrEvaluator->>CUDA: setCurrentCUDAStream(stream)
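
The flow in the diagram can be sketched in a few lines of standalone C++. This is only an illustration of the binding pattern, not the PR's implementation: `FakeCudaRuntime`, `FakeCudaStream`, `MiniEvaluator`, and the simplified `Stream`, `GetCurrentStream`, and `SetCurrentStream` structs are hypothetical stand-ins, and no real CUDA or nvFuser API is called. The assumption taken from the diagram is that `GetCurrentStream` treats its stream as an output, which the evaluator binds to whatever the runtime reports as current, and that `SetCurrentStream` treats its stream as an input that must already be bound.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a CUDA stream handle (not cudaStream_t).
using FakeCudaStream = std::uint64_t;

// Hypothetical stand-in for the runtime's "current stream" state.
struct FakeCudaRuntime {
  FakeCudaStream current = 0;  // 0 plays the role of the default stream
  FakeCudaStream getCurrentStream() const { return current; }
  void setCurrentStream(FakeCudaStream s) { current = s; }
};

// Simplified IR-level stream; only its address (identity) matters here.
struct Stream {
  int id;
};

// The stream is the node's OUTPUT: evaluation binds it to a runtime stream.
struct GetCurrentStream {
  const Stream* out;
};

// The stream is the node's INPUT: it must already be bound.
struct SetCurrentStream {
  const Stream* in;
};

class MiniEvaluator {
 public:
  explicit MiniEvaluator(FakeCudaRuntime& rt) : rt_(rt) {}

  // Mirrors "streams_[output_stream] = cuda_stream" from the diagram.
  void handle(const GetCurrentStream& node) {
    streams_[node.out] = rt_.getCurrentStream();
  }

  // Looks up the bound runtime stream and makes it current.
  void handle(const SetCurrentStream& node) {
    auto it = streams_.find(node.in);
    assert(it != streams_.end() && "stream must be bound before it is set");
    rt_.setCurrentStream(it->second);
  }

 private:
  FakeCudaRuntime& rt_;
  std::unordered_map<const Stream*, FakeCudaStream> streams_;
};
```

With this sketch, handling `GetCurrentStream{&s}` records the runtime's current stream under `s`, and a later `SetCurrentStream{&s}` restores it even if the runtime's current stream changed in between. Modeling the stream as an output of one node and an input of the other is what lets the dependency between the two show up in the IR itself.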
!test

!test
For #5308