Implement Send + Sync for CudaStream #254
Closed
Hi!
First, thank you for the cudarc project. It's a nice piece of engineering.
In my Novigrad project, I use cudarc.
My models are declared like in PyTorch (but in Rust).
Then, I generate instructions with opcodes and operands.
Using a dependency analysis, I assign each instruction to a logical stream. Logical streams have dependencies too.
At runtime, a logical stream is mapped to a physical CUDA stream.
For example, for multi-head attention, with 12 attention heads, the dependency analysis figures out that each head can be executed on a separate logical stream. Then the logical stream for Concat depends on the 12 logical streams of the heads.
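To make the idea concrete, here is a minimal sketch of one way such an assignment could work. It is not Novigrad's actual code: the `Instruction` type, the `assign_streams` function, and the chain-merging rule (an instruction stays on its producer's stream only when it is that producer's sole consumer) are all assumptions made for illustration. It also assumes the program is in topological order and that each operand is written once.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical instruction: an opcode plus operand ids (illustrative only).
pub struct Instruction {
    pub opcode: &'static str,
    pub inputs: Vec<usize>,
    pub output: usize,
}

/// Returns the logical stream of each instruction and, for each logical
/// stream, the set of logical streams it depends on.
pub fn assign_streams(program: &[Instruction]) -> (Vec<usize>, Vec<BTreeSet<usize>>) {
    let n = program.len();

    // For each instruction, find the earlier instructions that produce its inputs.
    let mut producer: HashMap<usize, usize> = HashMap::new();
    let mut deps: Vec<BTreeSet<usize>> = Vec::with_capacity(n);
    for (i, ins) in program.iter().enumerate() {
        deps.push(ins.inputs.iter().filter_map(|op| producer.get(op).copied()).collect());
        producer.insert(ins.output, i);
    }

    // Count how many instructions consume each instruction's result.
    let mut consumers = vec![0usize; n];
    for d in deps.iter().flatten() {
        consumers[*d] += 1;
    }

    let mut stream_of: Vec<usize> = Vec::with_capacity(n);
    let mut stream_deps: Vec<BTreeSet<usize>> = Vec::new();
    for i in 0..n {
        let sole = if deps[i].len() == 1 { deps[i].iter().next().copied() } else { None };
        let stream = match sole {
            // A linear chain stays on the producer's stream.
            Some(d) if consumers[d] == 1 => stream_of[d],
            // Fan-out, fan-in, or no producer: open a new logical stream.
            _ => {
                stream_deps.push(BTreeSet::new());
                stream_deps.len() - 1
            }
        };
        stream_of.push(stream);
        // Record cross-stream dependencies.
        for &d in &deps[i] {
            if stream_of[d] != stream {
                stream_deps[stream].insert(stream_of[d]);
            }
        }
    }
    (stream_of, stream_deps)
}
```

With one input instruction feeding several heads and a final Concat, each head lands on its own logical stream and the Concat stream depends on all of the head streams.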
I use a scheduler with a controller and execution units in a multi-threaded implementation.
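A rough sketch of the controller and execution unit pattern, using only the standard library, might look like the following. The `run_streams` function, the round-robin dispatch, and the use of `mpsc` channels are assumptions for illustration and do not describe the real scheduler. The sketch assumes the stream dependencies are acyclic and that there is at least one execution unit.

```rust
use std::collections::BTreeSet;
use std::sync::mpsc;
use std::thread;

/// Runs logical streams on `units` worker threads while respecting
/// `stream_deps`, and returns the order in which streams completed.
pub fn run_streams(stream_deps: &[BTreeSet<usize>], units: usize) -> Vec<usize> {
    assert!(units > 0, "need at least one execution unit");
    let n = stream_deps.len();
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    // Spawn the execution units, each with its own work queue.
    let mut work_txs = Vec::new();
    let mut handles = Vec::new();
    for _ in 0..units {
        let (tx, rx) = mpsc::channel::<usize>();
        let done = done_tx.clone();
        handles.push(thread::spawn(move || {
            for stream in rx {
                // A real unit would issue the stream's instructions here.
                done.send(stream).unwrap();
            }
        }));
        work_txs.push(tx);
    }
    drop(done_tx);

    // Controller loop: dispatch every stream whose dependencies are finished.
    let mut dispatched = vec![false; n];
    let mut finished = BTreeSet::new();
    let mut order = Vec::with_capacity(n);
    let mut in_flight = 0usize;
    let mut next_unit = 0usize;
    while order.len() < n {
        for s in 0..n {
            if !dispatched[s] && stream_deps[s].is_subset(&finished) {
                dispatched[s] = true;
                work_txs[next_unit % units].send(s).unwrap();
                next_unit += 1;
                in_flight += 1;
            }
        }
        assert!(in_flight > 0, "dependency cycle between logical streams");
        let s = done_rx.recv().unwrap();
        in_flight -= 1;
        finished.insert(s);
        order.push(s);
    }

    // Closing the work queues lets the units exit their loops.
    drop(work_txs);
    for h in handles {
        h.join().unwrap();
    }
    order
}
```

Because a stream is only dispatched once all of its dependencies have reported completion, a Concat stream that depends on every head stream always finishes last, whichever unit runs it.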
It requires CudaStream to implement Send + Sync.
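For context, a type that holds a raw pointer is neither `Send` nor `Sync` by default, so it cannot be moved into or shared between threads. The sketch below shows the general shape of opting in, using a stand-in `StreamHandle` type that is hypothetical and is not cudarc's definition of `CudaStream`. The `unsafe impl` lines are only sound under the stated assumption that the underlying handle may be used from any host thread.

```rust
use std::ffi::c_void;
use std::sync::Arc;
use std::thread;

/// Stand-in for a type owning a raw driver handle (hypothetical).
pub struct StreamHandle {
    raw: *mut c_void,
}

// SAFETY (assumption for this sketch): the handle is an opaque token that
// the driver allows any host thread to use, and this type never
// dereferences it directly.
unsafe impl Send for StreamHandle {}
unsafe impl Sync for StreamHandle {}

impl StreamHandle {
    /// Creates a placeholder handle; a real one would come from the driver.
    pub fn null() -> Self {
        Self { raw: std::ptr::null_mut() }
    }

    pub fn is_null(&self) -> bool {
        self.raw.is_null()
    }
}

/// Shares one handle with `workers` threads and counts how many observed it
/// as null. This only compiles because of the impls above.
pub fn share_across_threads(handle: Arc<StreamHandle>, workers: usize) -> usize {
    let joins: Vec<_> = (0..workers)
        .map(|_| {
            let h = Arc::clone(&handle);
            thread::spawn(move || h.is_null())
        })
        .collect();
    joins
        .into_iter()
        .map(|j| j.join().unwrap())
        .filter(|&seen_null| seen_null)
        .count()
}
```

Removing either `unsafe impl` makes `thread::spawn` reject the closure at compile time, which is the error a multi-threaded scheduler runs into when the stream type lacks these bounds.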
I ran the tests (cargo test --release). I don't currently have cuDNN and NCCL installed, so the tests that depend on them failed.