[FEA] and [DOC]: Diagnostic when par_nosync is used but the algorithm will block anyways. #4118

Open
brycelelbach opened this issue Mar 12, 2025 · 1 comment

Comments

@brycelelbach
Contributor

brycelelbach commented Mar 12, 2025

From @leofang

I’d very much rather have Thrust raise a compile-time error when thrust::something(thrust::cuda::par_nosync, ...); is called and something can only be blocking. I don’t want new API names because the API surface is already very large and I need my muscle memory. However, I do want to know the run-time behavior at compile time, so that I can rely on the compiler to tell me to fix my code. Is it already done today?

We should do this. Even if you don't want to expose this to users (which we should), we want something like this so that we can write unit tests that verify whether algorithms block or not, and improve our documentation of which algorithms block.

Options:

  • Add a new nosync policy type that will fail at compile time if blocking would happen.
  • Add a macro that would make the existing par_nosync fail at compile time if blocking happens. I don't like macros for this.
  • Any of the above options, but with compile time warnings instead of failures.
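As a rough illustration of the first option, a strict policy type could be checked against a per-algorithm trait at compile time. This is only a sketch under stated assumptions: `par_nosync_strict_t`, `always_blocks`, `allowed_v`, `check_policy`, and the algorithm tags are hypothetical names that do not exist in Thrust, and the choice of which algorithm blocks is assumed for the example.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration; none of these names exist in Thrust.
struct par_nosync_t {};         // models today's policy: a hint, may still block
struct par_nosync_strict_t {};  // models the proposed policy: blocking is an error

// Tags naming algorithms. Whether each one blocks is an assumption here.
struct for_each_tag {};  // assumed able to return without synchronizing
struct reduce_tag {};    // assumed to block, since it returns its result to the host

// Trait recording which algorithms can only be blocking.
template <class Algo> struct always_blocks : std::false_type {};
template <> struct always_blocks<reduce_tag> : std::true_type {};

// A (policy, algorithm) pair is rejected only when the policy is strict
// and the algorithm is known to block.
template <class Policy, class Algo>
inline constexpr bool allowed_v =
    !(std::is_same_v<Policy, par_nosync_strict_t> && always_blocks<Algo>::value);

// Each algorithm entry point would call this before dispatching.
template <class Algo, class Policy>
constexpr bool check_policy(Policy) {
  static_assert(allowed_v<Policy, Algo>,
                "this algorithm always blocks; it cannot run under the strict no-sync policy");
  return true;
}
```

With this shape, `check_policy<reduce_tag>(par_nosync_strict_t{})` would fail to compile, while the same call with `par_nosync_t{}` keeps today's behavior. The same trait could also back unit tests and the documentation of which algorithms block.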

To both implement this feature and to improve our docs, we need to funnel all blocking operations through a common interface.

  • We already do this for some obvious operations like cuda(Stream|Device)Synchronize.
  • We don't do this for cudaFree, which always blocks.
  • We don't do this for cudaMalloc, which rarely blocks, but can under some circumstances on multi-GPU systems with cross-GPU allocation visibility enabled.
  • There may be other CUDA runtime interfaces that block under subtle conditions. We should speak with someone from the CUDA driver team and get a comprehensive list.
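One possible shape for such a common interface is a single wrapper that every potentially blocking call goes through, tagged with how often it blocks. The sketch below is an assumption, not CCCL code: `blocking_kind`, `block_log`, `blocking_call`, and the `fake_*` functions are hypothetical, and the stubs stand in for the real CUDA runtime calls so the example compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical sketch; the names below are not part of Thrust or the CUDA runtime.
enum class blocking_kind { never, sometimes, always };

// Counts calls that went through the choke point, by category.
struct block_log {
  std::size_t always = 0;
  std::size_t sometimes = 0;
};

inline block_log& log_instance() {
  static block_log log;
  return log;
}

// Single choke point: records the call's category, then forwards to it.
template <blocking_kind Kind, class F, class... Args>
decltype(auto) blocking_call(F&& f, Args&&... args) {
  if constexpr (Kind == blocking_kind::always) {
    ++log_instance().always;
  } else if constexpr (Kind == blocking_kind::sometimes) {
    ++log_instance().sometimes;
  }
  return std::forward<F>(f)(std::forward<Args>(args)...);
}

// Stubs standing in for cudaMalloc, a kernel launch, and cudaFree.
inline int fake_malloc(void** p, std::size_t) { *p = nullptr; return 0; }
inline int fake_launch() { return 0; }
inline int fake_free(void*) { return 0; }

// A made-up algorithm body that allocates temporary storage.
inline void fake_algorithm_with_temp_storage() {
  void* tmp = nullptr;
  blocking_call<blocking_kind::sometimes>(fake_malloc, &tmp, std::size_t{64});
  blocking_call<blocking_kind::never>(fake_launch);
  blocking_call<blocking_kind::always>(fake_free, tmp);
}

// The kind of unit test this would enable: assert how often an algorithm blocks.
inline void test_fake_algorithm_blocks() {
  log_instance() = block_log{};
  fake_algorithm_with_temp_storage();
  assert(log_instance().always == 1);
  assert(log_instance().sometimes == 1);
}
```

Because the category is a template parameter, the same wrapper could also carry a `static_assert` against a strict no-sync policy, so the diagnostic and the tests would share one source of truth.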

Does the CUDA driver/runtime accurately document which interfaces can block?

@leofang
Member

leofang commented Mar 12, 2025

Does the CUDA driver/runtime accurately document which interfaces can block?

Not to my knowledge, no.
