UCT/CUDA: Detect sys_dev for async allocations #10607

Merged
yosefe merged 1 commit into openucx:master from brminich:uct/fix_async_mem_detection
Apr 7, 2025

Contributor

@brminich brminich commented Apr 4, 2025

What?

Currently, asynchronously allocated memory can be detected as CUDA-managed with its sys_dev set to unknown. For async VMM memory, however, the sys_dev is needed to identify the correct CUDA context in which to execute cuMemcpyAsync. This PR detects the sys_dev for such allocations.
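To make the intended behavior concrete, here is a minimal sketch of the detection decision in plain C. It is a simplified model, not the UCX code: the types `sketch_ptr_info_t` and `sketch_mem_info_t`, the function `sketch_detect_memory`, and the `SKETCH_*` constants are all hypothetical, and the device ordinal is used directly as the sys_dev for brevity, whereas a real implementation would map it to a system device index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not UCX definitions. */
typedef enum {
    SKETCH_MEM_HOST,
    SKETCH_MEM_CUDA,
    SKETCH_MEM_CUDA_MANAGED
} sketch_mem_type_t;

#define SKETCH_SYS_DEV_UNKNOWN (-1)

/* Simplified view of what a pointer query might report (assumed fields). */
typedef struct {
    bool is_device_memory; /* pointer refers to a CUDA allocation */
    bool is_managed;       /* allocated as managed memory */
    bool is_async;         /* allocated asynchronously */
    int  device_ordinal;   /* owning device, or -1 if not known */
} sketch_ptr_info_t;

typedef struct {
    sketch_mem_type_t mem_type;
    int               sys_dev;
} sketch_mem_info_t;

/* Classify a pointer and, for async allocations, keep the owning device. */
sketch_mem_info_t sketch_detect_memory(const sketch_ptr_info_t *info)
{
    sketch_mem_info_t out = { SKETCH_MEM_HOST, SKETCH_SYS_DEV_UNKNOWN };

    assert(info != NULL);

    if (!info->is_device_memory) {
        return out;
    }

    if (info->is_managed || info->is_async) {
        /* Async memory is still reported as CUDA-managed, as described above. */
        out.mem_type = SKETCH_MEM_CUDA_MANAGED;
        if (info->is_async && info->device_ordinal >= 0) {
            /* Record the device so a later copy can pick a context.
             * Assumption: ordinal and sys_dev coincide in this model. */
            out.sys_dev = info->device_ordinal;
        }
        return out;
    }

    out.mem_type = SKETCH_MEM_CUDA;
    if (info->device_ordinal >= 0) {
        out.sys_dev = info->device_ordinal;
    }
    return out;
}
```

In this model, an async allocation on device 1 is still classified as managed but carries sys_dev 1, while a plain managed allocation keeps an unknown sys_dev.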

Why?

The test from #10601 fails when sending an eager message from legacy pinned memory to asynchronous VMM memory (on isr1). On the receiver side, the destination buffer is detected as CUDA-managed with an unknown sys_dev, so the cuda_copy transport cannot select the correct context for the VMM allocation.
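The failure mode can be illustrated with a small, self-contained model of context selection. This is only a sketch under stated assumptions: `sketch_dev_ctx_t`, `sketch_select_copy_ctx`, and `SKETCH_NO_SYS_DEV` are hypothetical names, and an integer id stands in for a real CUDA context handle. It does not reproduce how cuda_copy actually stores or looks up contexts.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NO_SYS_DEV (-1)
#define SKETCH_NO_CTX     (-1)

/* Hypothetical per-device record; ctx_id stands in for a context handle. */
typedef struct {
    int sys_dev;
    int ctx_id;
} sketch_dev_ctx_t;

/*
 * Return the context id registered for sys_dev, or SKETCH_NO_CTX when the
 * device is unknown or has no entry in the table.
 */
int sketch_select_copy_ctx(const sketch_dev_ctx_t *table, size_t count,
                           int sys_dev)
{
    size_t i;

    assert((table != NULL) || (count == 0));

    if (sys_dev == SKETCH_NO_SYS_DEV) {
        /* Nothing to look up: this mirrors the failing receiver-side case. */
        return SKETCH_NO_CTX;
    }

    for (i = 0; i < count; i++) {
        if (table[i].sys_dev == sys_dev) {
            return table[i].ctx_id;
        }
    }

    return SKETCH_NO_CTX;
}
```

With an unknown sys_dev the lookup has no key and returns no context, which is the situation the receiver hits in the failing test. Once the destination buffer carries a valid sys_dev, the same lookup resolves to the context of the device that owns the allocation.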

@brminich brminich changed the title UCT/CUDA: Detect sys_dev for sync allocations UCT/CUDA: Detect sys_dev for async allocations Apr 4, 2025
@brminich brminich force-pushed the uct/fix_async_mem_detection branch from d8ebb65 to 88de9de Compare April 4, 2025 16:32
@yosefe yosefe merged commit 194aec9 into openucx:master Apr 7, 2025
151 checks passed