UCM/CUDA: Correct region size for pitched CUDA allocation hooks #10527
3c9dcc2 to 985bcc6
@hoopoepg Please have a look at this PR when you get a chance. There are some failed checks, but I don't think they are related to this change.
Hello, have you or your organization signed the CLA (https://openucx.org/license/)?
No, not yet. I'm planning to sign an individual CLA, because the PR is unrelated to my organization. Is it still possible to send a signed version via email as described here? UPD: Sorry, found it at the very top of the "Guidance for contributors" page.
Co-authored-by: Raul Akhmetshin <[email protected]>
What?
For pitched CUDA memory allocations (cuMemAllocPitch, cuMemAllocPitch_v2, cudaMallocPitch, cudaMalloc3D), the allocated region may be bigger than requested. The UCM code incorrectly marked those regions as having size width * height[ * depth], but their actual size is pitch * height[ * depth].
Why?
An incorrect region size leaves part of the allocation marked with the wrong UCS_MEMORY_TYPE, so later operations may attempt to access device memory as if it were host memory.
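To make the size difference concrete, here is a minimal sketch in plain C of the two formulas from the description. It is not the UCM hook code. The function names and the 512-byte alignment in the usage note are hypothetical and chosen only for illustration; the real pitch is chosen by the CUDA driver and returned by the allocation call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round the row width up to a multiple of `alignment`
 * to imitate a padded pitch. This is an assumption for illustration; the
 * real value is returned by the pitched allocation call itself. */
size_t example_pitch(size_t width_bytes, size_t alignment)
{
    assert(alignment != 0);
    return ((width_bytes + alignment - 1) / alignment) * alignment;
}

/* Size based on the requested width only: ignores per-row padding. */
size_t requested_size(size_t width_bytes, size_t height, size_t depth)
{
    return width_bytes * height * depth;
}

/* Size of the region actually allocated: each row occupies `pitch` bytes. */
size_t pitched_size(size_t pitch, size_t height, size_t depth)
{
    return pitch * height * depth;
}
```

For example, with a 100-byte row width, 4 rows, depth 1, and an assumed 512-byte pitch, requested_size gives 400 bytes while pitched_size gives 2048 bytes. The remaining 1648 bytes are the part of the region that would be left with the wrong memory type.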
Closes #10526