Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What part of &CudaSlice<T> can be mutated by a CudaFunction? #272

Open
workingjubilee opened this issue Jul 10, 2024 · 3 comments
Open

What part of &CudaSlice<T> can be mutated by a CudaFunction? #272

workingjubilee opened this issue Jul 10, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

@workingjubilee
Copy link

Currently, it is defined thusly:

pub struct CudaSlice<T> {
    pub(crate) cu_device_ptr: sys::CUdeviceptr,
    pub(crate) len: usize,
    pub(crate) device: Arc<CudaDevice>,
    pub(crate) host_buf: Option<Pin<Vec<T>>>,
}

That's 4 fields. Which of these is potentially mutated by a CudaFunction, per this note?

@coreylowman
Copy link
Owner

The data on the GPU pointed to by cu_device_ptr. Note that kernels can do anything with that pointer, so if a kernel extends past that pointer it will modify other data in gpu device memory (not the same as host memory).

The only field passed to the kernel when using a CudaSlice is the cu_device_ptr. The other fields are not passed unless you explicitly include that as a parameter in launch.

Would probably be good to add this information to the docs

@coreylowman coreylowman added documentation Improvements or additions to documentation question Further information is requested labels Jul 16, 2024
@workingjubilee
Copy link
Author

It would be useful, yes.

I suggest relating the slice to being essentially &[UnsafeCell<T>], because of this. That, or to require CudaFunctions to only execute after all views into the CudaSlice from Rust have ended.

Certainly, &mut CudaSlice<T> should not exist in Rust while a CudaFunction is running.

@coreylowman
Copy link
Owner

coreylowman commented Jul 16, 2024

Ideally yes, but practically that makes the API very hard to use. Instead we just make everything involved in calling a kernel unsafe, so all users of the launching api have to ensure that the constraints are not violated themselves, because we cannot do that in this library without making the API really complex (feel free to try a different approach if you believe you can make it work!)

I'll also add that you really can only mutate CudaSlice<T> using the cuda kernels, so we at least don't have to worry about users modifying the data on the rust side at the same time as a cuda kernel is. Every function (at the safe level) that transfers from device to host has synchronous calls inside of it to cover this case (i.e. wait for all kernels to finish running before you can copy data from the device to host side).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants