I'd much rather have Thrust raise a compile-time error when `thrust::something(thrust::cuda::par_nosync, ...);` is called and `something` can only be blocking. I don't want new API names, because the API surface is already very large and I need my muscle memory. However, I do want to know the run-time behavior at compile time, so that I can rely on the compiler to tell me to fix my code. Is this already possible today?
We should do this. Even if you don't want to expose this to users (which we should), we want something like this so that we can write unit tests that verify whether algorithms block or not, and improve our documentation of which algorithms block.
Options:
1. Add a new nosync policy type that fails at compile time if blocking would happen.
2. Add a macro that makes the existing `par_nosync` fail at compile time if blocking happens. I don't like macros for this.
3. Either of the above options, but with compile-time warnings instead of failures.
To both implement this feature and to improve our docs, we need to funnel all blocking operations through a common interface.
- We already do this for some obvious operations like `cuda(Stream|Device)Synchronize`.
- We don't do this for `cudaFree`, which always blocks.
- We don't do this for `cudaMalloc`, which rarely blocks, but can under some circumstances on multi-GPU systems with cross-GPU allocation visibility enabled.
There may be other CUDA runtime interfaces that block under subtle conditions. We should speak with someone from the CUDA driver team and get a comprehensive list.
Does the CUDA driver/runtime accurately document which interfaces can block?