Add NANOVDB_USE_SYNC_CUDA_MALLOC define to force sync CUDA malloc #1799

Open: w0utert wants to merge 2 commits into master

w0utert commented Apr 25, 2024

In virtualized environments that slice up the GPU and share it between instances as vGPUs, GPU unified memory is usually disabled for security reasons. Asynchronous CUDA malloc/free depends on GPU unified memory, so until now it was not possible to deploy and run NanoVDB code in such environments.

This commit adds macros CUDA_MALLOC and CUDA_FREE and replaces all CUDA alloc/free calls with these macros. CUDA_MALLOC and CUDA_FREE expand to asynchronous CUDA malloc & free if the following two conditions are met:

  • The CUDA version needs to be >= 11.2, as this is the first version that supports cudaMallocAsync/cudaFreeAsync
  • NANOVDB_USE_SYNC_CUDA_MALLOC needs to be undefined

In all other cases, CUDA_MALLOC and CUDA_FREE expand to synchronous cudaMalloc/cudaFree.

Since NanoVDB is distributed as a header-only library, the NANOVDB_USE_SYNC_CUDA_MALLOC flag should be set by the build system of the project that includes the NanoVDB headers.
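
The selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the macro parameter lists (pointer, size, stream), the `kUsesAsyncAlloc` constant, and the `allocateAndRelease` helper are hypothetical. The `SKETCH_HAVE_CUDA` switch and the stand-in functions in its `#else` branch are also hypothetical; they exist only so the selection logic compiles on a machine without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

#ifdef SKETCH_HAVE_CUDA  // hypothetical switch: build against the real CUDA runtime
#include <cuda_runtime_api.h>
#else
// Stand-ins (NOT the CUDA API) so the sketch compiles without the CUDA toolkit.
#define CUDART_VERSION 11020
using cudaError_t  = int;
using cudaStream_t = void*;
constexpr cudaError_t cudaSuccess = 0;
constexpr cudaError_t cudaErrorMemoryAllocation = 2;
inline cudaError_t cudaMalloc(void** p, std::size_t n)
{
    *p = std::malloc(n);
    return *p ? cudaSuccess : cudaErrorMemoryAllocation;
}
inline cudaError_t cudaFree(void* p) { std::free(p); return cudaSuccess; }
inline cudaError_t cudaMallocAsync(void** p, std::size_t n, cudaStream_t) { return cudaMalloc(p, n); }
inline cudaError_t cudaFreeAsync(void* p, cudaStream_t) { return cudaFree(p); }
#endif

// Asynchronous allocation is chosen only if the runtime supports it (CUDA >= 11.2)
// and the client has not requested synchronous allocation.
#if (CUDART_VERSION >= 11020) && !defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
#define CUDA_MALLOC(ptr, size, stream) cudaMallocAsync((ptr), (size), (stream))
#define CUDA_FREE(ptr, stream)         cudaFreeAsync((ptr), (stream))
constexpr bool kUsesAsyncAlloc = true;
#else
// Synchronous fallback: the stream argument is evaluated and then ignored.
#define CUDA_MALLOC(ptr, size, stream) ((void)(stream), cudaMalloc((ptr), (size)))
#define CUDA_FREE(ptr, stream)         ((void)(stream), cudaFree((ptr)))
constexpr bool kUsesAsyncAlloc = false;
#endif

// Hypothetical helper: allocate a buffer and release it again through the macros.
inline bool allocateAndRelease(std::size_t bytes, cudaStream_t stream)
{
    assert(bytes > 0);
    void* buffer = nullptr;
    if (CUDA_MALLOC(&buffer, bytes, stream) != cudaSuccess) return false;
    return CUDA_FREE(buffer, stream) == cudaSuccess;
}
```

Compiling the same translation unit with `-DNANOVDB_USE_SYNC_CUDA_MALLOC` would select the synchronous branch without any change to the calling code, which is the point of routing every allocation through the two macros.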

w0utert requested a review from kmuseth as a code owner on April 25, 2024 12:55

linux-foundation-easycla bot commented Apr 25, 2024

CLA Signed


The committers listed above are authorized under a signed CLA.

w0utert added 2 commits April 25, 2024 15:18

Signed-off-by: Wouter Bijlsma <[email protected]>
w0utert force-pushed the sync-cuda-malloc-flag branch from 2e63b1e to 4c01b38 on April 25, 2024 13:18

kmuseth (Contributor) left a comment

Great contribution! However, before I approve it, let me try out your fix in the private development branch of NanoVDB. I will sync that repo with this (public) repo in the coming week; it includes several changes and improvements :)

kmuseth (Contributor) left a comment

On closer inspection, why not simply replace this existing line:

#if CUDART_VERSION < 11020

with

#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)

This avoids the need to introduce the new macro and also works with existing client code of NanoVDB that may already be using cudaMallocAsync and cudaFreeAsync. I tried it on Linux but not yet on Windows :)

w0utert (Author) commented May 3, 2024

@kmuseth that was the first solution I tried, but it didn’t work, because cudaMalloc and cudaFree calls would still be resolved to the CUDA ones and not the redefined ones from the NanoVDB header. Without any modification to our build I still got CUDA ‘not supported’ errors on allocations.

Maybe this can be worked around with some linker directives, but that seems brittle and could require annoying changes to the build system of projects that include the NanoVDB headers.

w0utert (Author) commented May 3, 2024

This was on Linux, by the way, so it’s interesting that it worked on your side. I assume there could be some link differences between our builds. I can double-check next week to be sure and to find out why.

kmuseth (Contributor) commented May 3, 2024

@w0utert I think you're right so I changed my implementation by simply placing the definitions of the functions in a namespace (nanovdb). So in the end my solution looks very much like yours, except I define functions vs macros. Let me create a PR that includes this fix plus a ton of other (long overdue) improvements to NanoVDB. I'll point you to the relevant changes so you can validate that it does indeed work for you.

A warning: this new PR introduces new namespaces in NanoVDB, so your client code might need to be tweaked. I can of course help.
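
A minimal sketch of the function-based variant described above, using the preprocessor condition suggested earlier in this thread. The namespace `nanovdb_sketch` and the function names `mallocAsync`/`freeAsync` are hypothetical and are not taken from the follow-up PR. The `SKETCH_HAVE_CUDA` switch and the stand-ins in its `#else` branch are also assumptions: the stand-in `cudaMallocAsync` always fails, to mimic an environment in which asynchronous allocation is reported as not supported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Normally passed by the client's build system (e.g. -DNANOVDB_USE_SYNC_CUDA_MALLOC);
// defined here only to keep the sketch self-contained.
#ifndef NANOVDB_USE_SYNC_CUDA_MALLOC
#define NANOVDB_USE_SYNC_CUDA_MALLOC
#endif

#ifdef SKETCH_HAVE_CUDA  // hypothetical switch: build against the real CUDA runtime
#include <cuda_runtime_api.h>
#else
// Stand-ins (NOT the CUDA API) so the sketch compiles without the CUDA toolkit.
#define CUDART_VERSION 11020
using cudaError_t  = int;
using cudaStream_t = void*;
constexpr cudaError_t cudaSuccess = 0;
constexpr cudaError_t cudaErrorMemoryAllocation = 2;
constexpr cudaError_t cudaErrorNotSupported = 801;
inline cudaError_t cudaMalloc(void** p, std::size_t n)
{
    *p = std::malloc(n);
    return *p ? cudaSuccess : cudaErrorMemoryAllocation;
}
inline cudaError_t cudaFree(void* p) { std::free(p); return cudaSuccess; }
// Mimic a vGPU-like setup where asynchronous allocation is unavailable.
inline cudaError_t cudaMallocAsync(void**, std::size_t, cudaStream_t) { return cudaErrorNotSupported; }
inline cudaError_t cudaFreeAsync(void*, cudaStream_t) { return cudaErrorNotSupported; }
#endif

namespace nanovdb_sketch {

// Callers always go through this wrapper; the qualified name cannot be
// confused with the global CUDA runtime function.
inline cudaError_t mallocAsync(void** d_ptr, std::size_t size, cudaStream_t stream)
{
    assert(d_ptr != nullptr);
#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
    (void)stream;  // the synchronous fallback ignores the stream
    return cudaMalloc(d_ptr, size);
#else
    return cudaMallocAsync(d_ptr, size, stream);
#endif
}

inline cudaError_t freeAsync(void* d_ptr, cudaStream_t stream)
{
#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
    (void)stream;
    return cudaFree(d_ptr);
#else
    return cudaFreeAsync(d_ptr, stream);
#endif
}

} // namespace nanovdb_sketch
```

Because the wrappers live in their own namespace, library and client code call `nanovdb_sketch::mallocAsync` explicitly, so the choice between the synchronous and asynchronous path is made in one place rather than depending on which global symbol a call happens to resolve to.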

w0utert (Author) commented May 3, 2024

@kmuseth sounds good, that’s actually nicer than using macros! I will try your PR early next week and report back, but I’m pretty sure it will work.

w0utert (Author) commented May 17, 2024

@kmuseth
I tested the feature/nanovdb_v32.7 branch from your fork and it works on Linux as well, thanks!

kmuseth (Contributor) commented May 24, 2024

@w0utert excellent - so are you okay if we close this PR?

w0utert (Author) commented May 24, 2024

@kmuseth yes, you can close this PR!
