Move `pathfinder` to cuda-python top level #723

rwgk · 2025-06-24T08:20:15Z

Description

Closes #719

Note: The name is changed from path_finder to pathfinder. The word "pathfinder" is very common, and it avoids underscore/dash confusion between directory names and package names.

Locally tested under Linux and Windows.

TODO:

.github/workflows updates (cuda_pathfinder needs to be pip-installed first)
cuda_bindings pyproject.toml → unconditionally depend on cuda_pathfinder
update pathfinder-related scripts in toolshed/ directory

…HOUT trying to make anything work.

… trying to make anything work.

…make anything work.

…ns.abc.Sequence (Python 3.9+)

…init__.py (to make the dynamic version work).

copy-pr-bot · 2025-06-24T08:20:19Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

…vidia_dynamic_lib

shwina · 2025-06-25T10:52:46Z

cuda_pathfinder/cuda/pathfinder/__init__.py

+def load_nvidia_dynamic_lib(libname: str) -> LoadedDL:
+    """Load a NVIDIA dynamic library by name.
+
+    Args:
+        libname: The name of the library to load (e.g. "cudart", "nvvm", etc.)
+
+    Returns:
+        A LoadedDL object containing the library handle and path
+
+    Raises:
+        RuntimeError: If the library cannot be found or loaded
+    """
+    return _load_nvidia_dynamic_lib.load_lib(libname)


You could define this wrapper function in the _dynamic_libs module and just import it here. This way your __init__.py here wouldn't have a ton of function definitions.

This file is meant to be the public API in one view.

This way your __init__.py here wouldn't have a ton of function definitions.

I understood exactly that was your goal: a flat list of available APIs.

Note that I moved the docstring here.

This file is meant to be the public API in one view.

Indeed, everything we define (or import) here that is not prefixed with _ constitutes the public API of the module cuda.path_finder. My above suggestion is just that we import load_nvidia_dynamic_lib rather than define it here.

That would make the public API far less obvious. E.g. to see the docstring, they'd need to open a private file.

I don't think that's true. They could still do:

from cuda.path_finder import load_nvidia_dynamic_lib help(load_nvidia_dynamic_lib)

Same as they would do now?

That'll only work interactively.

What I have now can be inspected in obvious ways directly in the sources, e.g. when looking at the sources on github. I can send URLs pointing to specific APIs in this init.py file.

Each function here will just be:

def function(...) -> ...: """docstring"" return call_into_private_code(...)

That's exactly the public API, with a one-line call that's easy to ignore.

What's the point of hiding that away, especially hiding away the docstring and the type hints?

That'll only work interactively.

The docstrings and type hints would also get picked up:

Sphinx (which we use to generate API docs
-pydoc

the developers' IDEs/lsp

Those are primarily the ways consumers of a package interact with docstrings or type hints, rather than looking directly at the source.

What's the point of hiding that away, especially hiding away the docstring and the type hints?

It tightens the scope of __init__.py, whose job is:

to include any initialization code for the module

to import stuff from submodules that the module wants to expose

to define __all__ for the module if needed

As examples, we can look at the __init__.py from some other popular libraries:

requests

pandas

It tightens the scope of init.py, whose job is

The most important job: Show the public API

You didn't answer why you want to hide that away.

I'm advocating for keeping function and type definitions outside of __init__.py. I don't think that "hides" anything from the user as we still import them here.

The main reason I'm advocating for that is because __init__.py typically only defines functions and types needed for module initialization, and imports anything else. I think the examples I linked to above are a good demonstration of that.

Defining functions and types beyond that serves to clutter __init__.py.

I would argue it makes the code base less navigable than more for people looking at the source. Do I expect the function load_nvidia_dynamic_libs to be defined in a file called _dynamic_libs.py, or a file called __init__.py?

If this all seems nitpicky, I apologize. While I do have a strong opinion here, I don't mind at all if another reviewer (or you, as the author of this PR) made the final call about this.

cuda.core's __init__.py is a good example for what Ashwin suggested. As a developer who occasionally peeks into someone else's __init__.py, I certainly get confused why we are defining things in there directly, as opposed to properly organizing them in respective public/private modules, and just import them. What's imported are considered public APIs (including types). Does it make sense?

rwgk · 2025-06-25T21:49:30Z

I'm a bit lost, to be honest, and I’d appreciate some clarification so I can move this forward in the direction you want.

The goals that seemed most important to me originally (from a Slack post last Wednesday) were:

Dependency isolation – avoid importing unrelated logic (e.g. dynamic lib loading) when only working with headers
Public API should be obvious from the code – minimal need for external docs or tooling to understand what’s exposed

After Ashwin’s comment on Slack, I interpreted the request as: prioritize a flat public API, even if that means giving up on some of that isolation — especially when type hints are involved.

Then Leo wrote:

as opposed to properly organizing them in respective public/private modules

That seems more aligned with what I had before the last commit (e.g., a file like nvidia_dynamic_libs.py directly under pathfinder/). But that seemed in conflict with Ashwin’s preference for flatness.

I’m happy to implement what you both want here — I just need a bit more clarity.

If we go back to what I proposed last week:

from cuda.pathfinder import nvidia_dynamic_libs, nvidia_static_libs, nvidia_headers

nvidia_dynamic_libs.load("nvrtc")
nvidia_static_libs.find("cudart")
nvidia_headers.find_file("cuda.h")

The file structure was:

cuda/pathfinder/
├── __init__.py  # empty
├── nvidia_dynamic_libs.py
├── nvidia_static_libs.py
└── nvidia_headers.py

I thought that structure balanced modularity and clarity, and users could still import just what they needed. But it sounds like we might instead prefer:

from cuda.pathfinder import load_nvidia_dynamic_lib, find_nvidia_static_lib, find_nvidia_header_file

That keeps the API flat, but does give up on the isolation goal unless we resort to function-local imports or other workarounds.

Could you please help me understand: What would be the preferred file/module structure to go along with the flat API? Should I keep separate modules like nvidia_dynamic_libs.py and re-export in __init__.py, or move everything into a single file (what would be the name)?

shwina · 2025-06-25T22:57:32Z

Given there's some urgency for getting this PR in, @rwgk @leofang do you want to proceed with the PR as is and address the questions Ralf posed above as a follow-up?

leofang · 2025-06-26T00:07:51Z

At this point we need to follow up offline. I think there is a gap that we unfortunately did not really cover during the weekly meeting. I suggest we don't cancel the dev meeting tomorrow and use the time to finalize it.

This PR also needs at least @kkraus14's approval for the change of license.

rwgk · 2025-07-03T17:42:24Z

/ok to test

rwgk · 2025-07-03T18:57:14Z

/ok to test

ci/tools/run-tests

leofang · 2025-07-03T19:44:25Z

.github/workflows/test-wheel-linux.yml

-            pip install cuda_python*.whl
+            pip install cuda_pathfinder*.whl cuda_python*.whl
          else
-            pip install $(ls cuda_python*.whl)[all]
+            pip install $(ls cuda_pathfinder*.whl) $(ls cuda_python*.whl)[all]


The changes to the cuda-python step should be reverted now that you install cuda-pathfinder the first.

leofang · 2025-07-03T19:44:36Z

.github/workflows/test-wheel-windows.yml

-            pip install (Get-ChildItem -Filter cuda_python*.whl).FullName
+            $pathfinder = (Get-ChildItem -Filter cuda_pathfinder*.whl).FullName
+            $umbrella = (Get-ChildItem -Filter cuda_python*.whl).FullName
+            pip install $pathfinder $umbrella
          } else {
-            pip install "$((Get-ChildItem -Filter cuda_python*.whl).FullName)[all]"
+            $pathfinder = (Get-ChildItem -Filter cuda_pathfinder*.whl).FullName
+            $umbrella = (Get-ChildItem -Filter cuda_python*.whl).FullName
+            pip install "$pathfinder" "$umbrella[all]"


The changes to the cuda-python step should be reverted now that you install cuda-pathfinder the first.

… wheels are installed already.

rwgk · 2025-07-03T20:19:17Z

/ok to test

…ironment variable.

…t_work|see_what_works (/ci/tools/run-tests).

rwgk · 2025-07-03T22:36:04Z

/ok to test

rwgk · 2025-07-03T23:09:20Z

@leofang Does this ring any bells?

https://github.com/NVIDIA/cuda-python/actions/runs/16061794063/job/45329524330?pr=723

FAILED tests/test_load_nvidia_dynamic_lib.py::test_load_nvidia_dynamic_lib[cudart] - RuntimeError: Child process failed for libname='cudart' with exit code 1
--- stdout-from-child-process ---
<end-of-stdout-from-child-process>
--- stderr-from-child-process ---
Traceback (most recent call last):
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_dl_windows.py", line 115, in load_with_abs_path
    handle = win32api.LoadLibraryEx(found_path, 0, flags)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pywintypes.error: ([193](https://github.com/NVIDIA/cuda-python/actions/runs/16061794063/job/45329524330?pr=723#step:25:194), 'LoadLibraryEx', '%1 is not a valid Win32 application.')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\a\cuda-python\cuda-python\cuda_pathfinder\tests\spawned_process_runner.py", line 41, in __call__
    self.target(*self.args, **self.kwargs)
  File "C:\a\cuda-python\cuda-python\cuda_pathfinder\tests\test_load_nvidia_dynamic_lib.py", line 66, in child_process_func
    loaded_dl_fresh = load_nvidia_dynamic_lib(libname)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_nvidia_dynamic_lib.py", line 61, in load_nvidia_dynamic_lib
    return _load_lib_no_cache(libname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_nvidia_dynamic_lib.py", line 45, in _load_lib_no_cache
    return load_with_abs_path(libname, found.abs_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_dl_windows.py", line 117, in load_with_abs_path
    raise RuntimeError(f"Failed to load DLL at {found_path}: {e}") from e
RuntimeError: Failed to load DLL at C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_runtime\bin\cudart32_110.dll: (193, 'LoadLibraryEx', '%1 is not a valid Win32 application.')
<end-of-stderr-from-child-process>

rwgk · 2025-07-03T23:12:07Z

Assuming the last commit (bf48142) works as hoped, the Windows cudart 11.8.0 '%1 is not a valid Win32 application.' error may be the only error left.

leofang · 2025-07-04T02:50:40Z

This error happens usually because the exe or dll is not compatible with the platform, which is indeed the case based on the error log

RuntimeError: Failed to load DLL at C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_runtime\bin\cudart32_110.dll: (193, 'LoadLibraryEx', '%1 is not a valid Win32 application.')

We're trying to load the 32-bit cudart (cudart32_110.dll) on a 64-bit platform.

Another bug is: Both wheel and local CTK pipelines hit the same error message, meaning we also installed wheels to the local CTK pipelines. There is at lease one bug in the CI update that caused this.

leofang · 2025-07-04T03:25:22Z

cuda_pathfinder/cuda/pathfinder/_dynamic_libs/supported_nvidia_libs.py

Bug No.1: we should remove 32-bit cudart DLLs from line 253-255.

Done: commit 135a37d

leofang · 2025-07-04T03:28:37Z

.github/workflows/test-wheel-windows.yml

+      # Run only after cuda.bindings tests, so that all wheels are installed already.
+      - name: Run cuda.pathfinder tests


"Bug" No.2: cuda.pathfinder tests probably should be run after cuda.bindings but before cuda.core, because cuda.core needs the cudart wheel as a test dependency (it's used by CuPy, not by cuda.core).

In the long run, we should really move cuda.pathfinder to its own test workflow. This kind of execution-order-dependent issues is not obvious.

Hm, I made the change locally, but then backed it out again: After removing the 32-bit DLLs (135a37d), I'd expect the tests to work, and it tests a little bit more.

rwgk · 2025-07-05T21:19:06Z

/ok to test

github-actions · 2025-07-05T21:33:24Z

Doc Preview CI
🚀 View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-723/
https://nvidia.github.io/cuda-python/pr-preview/pr-723/cuda-core/
https://nvidia.github.io/cuda-python/pr-preview/pr-723/cuda-bindings/
Preview will be ready when the GitHub Pages deployment is complete.

…quires 64-bit Python. Currently running: 32-bit Python

…", "nvvm32"))

…hfinder/pyproject.toml

…gain with all_must_work

rwgk · 2025-07-05T23:09:33Z

/ok to test

…HFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS is defined (to resolve `unbound variable` error).

rwgk · 2025-07-06T03:23:41Z

/ok to test

rwgk · 2025-07-06T04:09:50Z

/ok to test

…dows (to see if that resolves the cusolver failures)

rwgk · 2025-07-06T05:10:36Z

/ok to test

rwgk added 15 commits June 22, 2025 11:09

Move files WITHOUT trying to make anything work.

fbdebfa

First pass changing imports from cuda.bindings → cuda.path_finder WIT…

731256d

…HOUT trying to make anything work.

Move README.md one level up.

6e0ec6c

Move find_sub_dirs.py → ../_utils/find_sub_dirs.py

07e1385

Move files from cuda_bindings/tests to cuda_path_finder/tests WITHOUT…

68aa8d5

… trying to make anything work.

First VERY ROUGH version of pyproject.toml

fafa998

Change imports in tests/ WITHOUT trying to make anything work.

4eff7d2

Clean up pyproject.toml (with the help of ChatGPT) WITHOUT trying to …

db0f540

…make anything work.

ruff automatic fix: Replace deprecated typing.Sequence with collectio…

9fea2e7

…ns.abc.Sequence (Python 3.9+)

Add dynamic = ["version"] in pyproject.toml and cuda/path_finder/__…

c494ef2

…init__.py (to make the dynamic version work).

Modernize [project] license line.

78b8c6d

Change dir/file names: path_finder → pathfinder

5690d96

Change imports: path_finder → pathfinder

81b9bd9

Fix up broken imports.

b9be5cb

add [project.optional-dependencies] test

79a8019

github-project-automation bot added this to CCCL Jun 24, 2025

github-project-automation bot moved this to Todo in CCCL Jun 24, 2025

leofang assigned rwgk Jun 24, 2025

leofang self-requested a review June 24, 2025 12:51

leofang added cuda.pathfinder Everything related to the cuda.pathfinder module enhancement Any code-related improvements P0 High priority - Must do! labels Jun 24, 2025

rwgk changed the title ~~Move path_finder to cuda-python top level~~ Move pathfinder to cuda-python top level Jun 24, 2025

rwgk added 2 commits June 24, 2025 17:18

Adopt cuda_core license.

975ba0b

cuda.pathfinder.nvidia_dynamic_libs.load_lib → cuda.pathfinder.load_n…

3678fb4

…vidia_dynamic_lib

shwina reviewed Jun 25, 2025

View reviewed changes

Download cuda-pathfinder-wheel into ./cuda_pathfinder (not .)

481f8de

leofang requested changes Jul 3, 2025

View reviewed changes

Run cuda.pathfinder tests only after cuda.bindings tests, so that all…

fd18e1e

… wheels are installed already.

rwgk added 3 commits July 3, 2025 14:24

Drive-by fix: test_find_or_load → test_load

796bd46

Introduce CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS env…

06763af

…ironment variable.

CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS=supported_mus…

b2135cf

…t_work|see_what_works (/ci/tools/run-tests).

Add cuda-pathfinder in .github/workflows/build-docs.yml

bf48142

leofang reviewed Jul 4, 2025

View reviewed changes

Remove 32-bit DLLs from SUPPORTED_WINDOWS_DLLS

135a37d

rwgk added 4 commits July 5, 2025 14:59

Add guard: RuntimeError: cuda.pathfinder.load_nvidia_dynamic_lib() re…

48c6d63

…quires 64-bit Python. Currently running: 32-bit Python

is_suppressed_dll_file(): return path_basename.startswith(("cudart32_…

7ab201a

…", "nvvm32"))

Add nvidia_wheels_cu12 to [project.optional-dependencies] in cuda_pat…

9f3e9a4

…hfinder/pyproject.toml

Install cuda.pathfinder nvidia_wheels_cu12 and run-tests pathfinder a…

00c8b6a

…gain with all_must_work

ci/tools/run-tests: use bash -v instead of -z to test if CUDA_PAT…

034286a

…HFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS is defined (to resolve `unbound variable` error).

Add missing shell: bash line in test-wheel-windows.yml

aaed5f2

test_load_nvidia_dynamic_lib: increase timeout to 120 seconds for Win…

9d8c70c

…dows (to see if that resolves the cusolver failures)

		# Run only after cuda.bindings tests, so that all wheels are installed already.
		- name: Run cuda.pathfinder tests

Move pathfinder to cuda-python top level #723

Are you sure you want to change the base?

Move pathfinder to cuda-python top level #723

Conversation

rwgk commented Jun 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

copy-pr-bot bot commented Jun 24, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

shwina Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

shwina Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

leofang Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk commented Jun 25, 2025

Uh oh!

shwina commented Jun 25, 2025

Uh oh!

leofang commented Jun 26, 2025

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

rwgk commented Jul 3, 2025

Uh oh!

leofang commented Jul 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwgk commented Jul 5, 2025

Uh oh!

github-actions bot commented Jul 5, 2025

Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

rwgk commented Jul 5, 2025

Uh oh!

rwgk commented Jul 6, 2025

Uh oh!

rwgk commented Jul 6, 2025

Move `pathfinder` to cuda-python top level #723

Move `pathfinder` to cuda-python top level #723

rwgk commented Jun 24, 2025 •

edited

Loading

shwina Jun 25, 2025 •

edited

Loading

shwina Jun 25, 2025 •

edited

Loading

leofang Jun 25, 2025 •

edited

Loading

leofang commented Jul 4, 2025 •

edited

Loading