Skip to content

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Oct 4, 2025

WIP — Will work on it more asap.

Copy link
Contributor

copy-pr-bot bot commented Oct 4, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 4, 2025

/ok to test

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 4, 2025

Local testing with a venv that does not have cuda-bindings:

pip install -v . --group test
...
Building wheels for collected packages: cuda-core
  Building wheel for cuda-core (pyproject.toml): started
  Running command Building wheel for cuda-core (pyproject.toml)
  CUDA paths: ['/usr/local/cuda']
  CUDA_VERSION from /usr/local/cuda/include/cuda.h: 13000
...
$ pip list
Package    Version
---------- -------
cuda-core  0.3.3a0
Cython     3.1.4
iniconfig  2.1.0
numpy      2.3.3
packaging  25.0
pip        25.2
pluggy     1.6.0
Pygments   2.19.2
pytest     8.4.2
setuptools 80.9.0

Copy link

github-actions bot commented Oct 4, 2025

@leofang
Copy link
Member

leofang commented Oct 4, 2025

Calling nvidia-smi is better because it does exactly the same thing under the hood (opening up libcuda/nvcuda and getting the driver version), but it is cross-platform and so takes much less lines of code, and it is a good habit to keep the code surface in build_hook.py small because it is not easy to write tests for modules outside of the source tree.

@rwgk
Copy link
Collaborator Author

rwgk commented Oct 4, 2025

Calling nvidia-smi is better because it does exactly the same thing under the hood (opening up libcuda/nvcuda and getting the driver version), but it is cross-platform and so takes much less lines of code, and it is a good habit to keep the code surface in build_hook.py small because it is not easy to write tests for modules outside of the source tree.

The main reason I was leaning into it is that I think it'd be a useful addition to pathfinder.

It's more lines of code only because subprocess is being reused.

Assume that the code I added lives in pathfinder, you'd just reuse that and have a more secure, more robust, and faster (which doesn't matter here, but is useful in general) solution with even less lines of code.

Copy link
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I was leaning into it is that I think it'd be a useful addition to pathfinder.

Why would the pathfinder need to expose the driver version check as a public API? Or you only had the driver loading part in mind?

Comment on lines +113 to +117
try:
# WinDLL => stdcall (CUDAAPI on Windows), matches CUDA Driver API.
lib = ctypes.WinDLL("nvcuda.dll")
except OSError:
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snippet is all we need for Windows and the rest can go. This way we just need one function to encapsulate driver version fetching in a cross-platform fashion. The platform check can happen inside the callee, not the caller.

@leofang leofang added enhancement Any code-related improvements P1 Medium priority - Should do cuda.core Everything related to the cuda.core module labels Oct 6, 2025
@rwgk
Copy link
Collaborator Author

rwgk commented Oct 6, 2025

The main reason I was leaning into it is that I think it'd be a useful addition to pathfinder.

Why would the pathfinder need to expose the driver version check as a public API?

It'd be useful in general, not just here. It's just a few lines of Python code that is very similar to other functionality in pathfinder, and does not introduce any new dependencies. Having the function there, would make what's needed here a no-brainer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda.core Everything related to the cuda.core module enhancement Any code-related improvements P1 Medium priority - Should do
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants