[SymMem 5/5] Contiguous View #5520
base: sym/bcast_allgather
Conversation
Greptile Overview

Greptile Summary: This PR implements … Key changes: …

How it works: … Issue found: …

Confidence Score: 4/5

Important Files Changed: File Analysis
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant HostIrEvaluator
    participant SymMemHandleCache
    participant SymMemForContiguousView
    participant SymmetricTensor
    participant CUDA_VMM as CUDA VMM/IPC
    User->>HostIrEvaluator: handle(SymmetricContiguousView)
    HostIrEvaluator->>HostIrEvaluator: getKnownConcreteValue(in_tv)
    Note over HostIrEvaluator: Get sharded input tensor [1, N]
    HostIrEvaluator->>SymMemHandleCache: get({in_tensor, expr})
    alt Handle not in cache
        SymMemHandleCache->>SymMemForContiguousView: new SymMemForContiguousView(in_tensor, expr)
        SymMemForContiguousView->>SymmetricTensor: new SymmetricTensor(in_tensor)
        SymMemForContiguousView->>SymmetricTensor: setupContiguousView(tag)
        SymmetricTensor->>CUDA_VMM: cuMemAddressReserve(total_size)
        Note over SymmetricTensor,CUDA_VMM: Reserve contiguous virtual address space<br/>for all ranks
        loop For each rank
            SymmetricTensor->>CUDA_VMM: Exchange IPC handles
            SymmetricTensor->>CUDA_VMM: cuMemMap(region, getAllocHandle(rank))
            SymmetricTensor->>CUDA_VMM: cuMemSetAccess(region, READWRITE)
        end
        SymmetricTensor->>SymmetricTensor: Create tensor from mapped memory
        Note over SymmetricTensor: Shape: [world_size, ...local_shape]
        SymmetricTensor-->>SymMemForContiguousView: Return contiguous_view
        SymMemForContiguousView->>SymMemForContiguousView: squeeze(0) if size(0)==1
        SymMemForContiguousView-->>SymMemHandleCache: Return handle
        SymMemHandleCache->>SymMemHandleCache: Cache handle
    end
    SymMemHandleCache-->>HostIrEvaluator: Return cached handle
    HostIrEvaluator->>SymMemForContiguousView: tensor()
    SymMemForContiguousView-->>HostIrEvaluator: Return contiguous tensor [world_size, 1, N]
    HostIrEvaluator->>HostIrEvaluator: squeeze(1)
    Note over HostIrEvaluator: Final shape: [world_size, N]
    HostIrEvaluator->>HostIrEvaluator: bind(out_tv, contiguous_tensor)
    HostIrEvaluator-->>User: Return unsharded tensor
```
8 files reviewed, 1 comment
```cpp
    handle->tensor().size(1) == 1,
    "Contiguous view must have size 1 on sharded dimension");
```
logic: checking size(1) assumes the tensor has at least 2 dimensions, but the input has shape [1, N], where dim 0 is the sharded dimension. After getContiguousView() returns [world_size, 1, N], checking size(1) == 1 is correct, but the code should verify the tensor has enough dimensions first.
Suggested change:

```diff
-    handle->tensor().size(1) == 1,
-    "Contiguous view must have size 1 on sharded dimension");
+NVF_ERROR(
+    handle->tensor().dim() >= 2 && handle->tensor().size(1) == 1,
+    "Contiguous view must have at least 2 dimensions with size 1 on second dimension");
```
Force-pushed from a9db7af to 1e9afa3 (compare)
Related PRs:
- `SymmetricTensor` runtime type: #5517
- `MemoryType::Symmetric`: #5518
- Full branch for reference: #5515