Skip to content

Move pathfinder to cuda-python top level #723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 57 commits into
base: main
Choose a base branch
from

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Jun 24, 2025

Description

Closes #719

Note: The name is changed from path_finder to pathfinder. The word "pathfinder" is very common, and it avoids underscore/dash confusion between directory names and package names.

Locally tested under Linux and Windows.

TODO:

  • .github/workflows updates (cuda_pathfinder needs to be pip-installed first)
  • cuda_bindings pyproject.toml → unconditionally depend on cuda_pathfinder
  • update pathfinder-related scripts in toolshed/ directory

Copy link
Contributor

copy-pr-bot bot commented Jun 24, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@leofang leofang self-requested a review June 24, 2025 12:51
@leofang leofang added cuda.pathfinder Everything related to the cuda.pathfinder module enhancement Any code-related improvements P0 High priority - Must do! labels Jun 24, 2025
@rwgk rwgk changed the title Move path_finder to cuda-python top level Move pathfinder to cuda-python top level Jun 24, 2025
Comment on lines 11 to 23
def load_nvidia_dynamic_lib(libname: str) -> LoadedDL:
"""Load a NVIDIA dynamic library by name.
Args:
libname: The name of the library to load (e.g. "cudart", "nvvm", etc.)
Returns:
A LoadedDL object containing the library handle and path
Raises:
RuntimeError: If the library cannot be found or loaded
"""
return _load_nvidia_dynamic_lib.load_lib(libname)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could define this wrapper function in the _dynamic_libs module and just import it here. This way your __init__.py here wouldn't have a ton of function definitions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is meant to be the public API in one view.

This way your __init__.py here wouldn't have a ton of function definitions.

I understood exactly that was your goal: a flat list of available APIs.

Note that I moved the docstring here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is meant to be the public API in one view.

Indeed, everything we define (or import) here that is not prefixed with _ constitutes the public API of the module cuda.path_finder. My above suggestion is just that we import load_nvidia_dynamic_lib rather than define it here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make the public API far less obvious. E.g. to see the docstring, they'd need to open a private file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's true. They could still do:

from cuda.path_finder import load_nvidia_dynamic_lib
help(load_nvidia_dynamic_lib)

Same as they would do now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'll only work interactively.

What I have now can be inspected in obvious ways directly in the sources, e.g. when looking at the sources on github. I can send URLs pointing to specific APIs in this init.py file.

Each function here will just be:

def function(...) -> ...:
    """docstring""
    return call_into_private_code(...)

That's exactly the public API, with a one-line call that's easy to ignore.

What's the point of hiding that away, especially hiding away the docstring and the type hints?

Copy link
Contributor

@shwina shwina Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'll only work interactively.

The docstrings and type hints would also get picked up:

  • Sphinx (which we use to generate API docs
    -pydoc
  • the developers' IDEs/lsp

Those are primarily the ways consumers of a package interact with docstrings or type hints, rather than looking directly at the source.

What's the point of hiding that away, especially hiding away the docstring and the type hints?

It tightens the scope of __init__.py, whose job is:

  • to include any initialization code for the module
  • to import stuff from submodules that the module wants to expose
  • to define __all__ for the module if needed

As examples, we can look at the __init__.py from some other popular libraries:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It tightens the scope of init.py, whose job is

The most important job: Show the public API

You didn't answer why you want to hide that away.

Copy link
Contributor

@shwina shwina Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm advocating for keeping function and type definitions outside of __init__.py. I don't think that "hides" anything from the user as we still import them here.

The main reason I'm advocating for that is because __init__.py typically only defines functions and types needed for module initialization, and imports anything else. I think the examples I linked to above are a good demonstration of that.

Defining functions and types beyond that serves to clutter __init__.py.

I would argue it makes the code base less navigable than more for people looking at the source. Do I expect the function load_nvidia_dynamic_libs to be defined in a file called _dynamic_libs.py, or a file called __init__.py?

If this all seems nitpicky, I apologize. While I do have a strong opinion here, I don't mind at all if another reviewer (or you, as the author of this PR) made the final call about this.

Copy link
Member

@leofang leofang Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuda.core's __init__.py is a good example for what Ashwin suggested. As a developer who occasionally peeks into someone else's __init__.py, I certainly get confused why we are defining things in there directly, as opposed to properly organizing them in respective public/private modules, and just import them. What's imported are considered public APIs (including types). Does it make sense?

@rwgk
Copy link
Collaborator Author

rwgk commented Jun 25, 2025

I'm a bit lost, to be honest, and I’d appreciate some clarification so I can move this forward in the direction you want.

The goals that seemed most important to me originally (from a Slack post last Wednesday) were:

  1. Dependency isolation – avoid importing unrelated logic (e.g. dynamic lib loading) when only working with headers
  2. Public API should be obvious from the code – minimal need for external docs or tooling to understand what’s exposed

After Ashwin’s comment on Slack, I interpreted the request as: prioritize a flat public API, even if that means giving up on some of that isolation — especially when type hints are involved.

Then Leo wrote:

as opposed to properly organizing them in respective public/private modules

That seems more aligned with what I had before the last commit (e.g., a file like nvidia_dynamic_libs.py directly under pathfinder/). But that seemed in conflict with Ashwin’s preference for flatness.

I’m happy to implement what you both want here — I just need a bit more clarity.

If we go back to what I proposed last week:

from cuda.pathfinder import nvidia_dynamic_libs, nvidia_static_libs, nvidia_headers

nvidia_dynamic_libs.load("nvrtc")
nvidia_static_libs.find("cudart")
nvidia_headers.find_file("cuda.h")

The file structure was:

cuda/pathfinder/
├── __init__.py  # empty
├── nvidia_dynamic_libs.py
├── nvidia_static_libs.py
└── nvidia_headers.py

I thought that structure balanced modularity and clarity, and users could still import just what they needed. But it sounds like we might instead prefer:

from cuda.pathfinder import load_nvidia_dynamic_lib, find_nvidia_static_lib, find_nvidia_header_file

That keeps the API flat, but does give up on the isolation goal unless we resort to function-local imports or other workarounds.

Could you please help me understand: What would be the preferred file/module structure to go along with the flat API? Should I keep separate modules like nvidia_dynamic_libs.py and re-export in __init__.py, or move everything into a single file (what would be the name)?

@shwina
Copy link
Contributor

shwina commented Jun 25, 2025

Given there's some urgency for getting this PR in, @rwgk @leofang do you want to proceed with the PR as is and address the questions Ralf posed above as a follow-up?

@leofang
Copy link
Member

leofang commented Jun 26, 2025

At this point we need to follow up offline. I think there is a gap that we unfortunately did not really cover during the weekly meeting. I suggest we don't cancel the dev meeting tomorrow and use the time to finalize it.

This PR also needs at least @kkraus14's approval for the change of license.

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

/ok to test

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

/ok to test

Comment on lines 305 to 321
pip install cuda_python*.whl
pip install cuda_pathfinder*.whl cuda_python*.whl
else
pip install $(ls cuda_python*.whl)[all]
pip install $(ls cuda_pathfinder*.whl) $(ls cuda_python*.whl)[all]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to the cuda-python step should be reverted now that you install cuda-pathfinder the first.

Comment on lines 271 to 277
pip install (Get-ChildItem -Filter cuda_python*.whl).FullName
$pathfinder = (Get-ChildItem -Filter cuda_pathfinder*.whl).FullName
$umbrella = (Get-ChildItem -Filter cuda_python*.whl).FullName
pip install $pathfinder $umbrella
} else {
pip install "$((Get-ChildItem -Filter cuda_python*.whl).FullName)[all]"
$pathfinder = (Get-ChildItem -Filter cuda_pathfinder*.whl).FullName
$umbrella = (Get-ChildItem -Filter cuda_python*.whl).FullName
pip install "$pathfinder" "$umbrella[all]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to the cuda-python step should be reverted now that you install cuda-pathfinder the first.

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

/ok to test

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

/ok to test

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

@leofang Does this ring any bells?

FAILED tests/test_load_nvidia_dynamic_lib.py::test_load_nvidia_dynamic_lib[cudart] - RuntimeError: Child process failed for libname='cudart' with exit code 1
--- stdout-from-child-process ---
<end-of-stdout-from-child-process>
--- stderr-from-child-process ---
Traceback (most recent call last):
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_dl_windows.py", line 115, in load_with_abs_path
    handle = win32api.LoadLibraryEx(found_path, 0, flags)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pywintypes.error: ([193](https://github.com/NVIDIA/cuda-python/actions/runs/16061794063/job/45329524330?pr=723#step:25:194), 'LoadLibraryEx', '%1 is not a valid Win32 application.')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\a\cuda-python\cuda-python\cuda_pathfinder\tests\spawned_process_runner.py", line 41, in __call__
    self.target(*self.args, **self.kwargs)
  File "C:\a\cuda-python\cuda-python\cuda_pathfinder\tests\test_load_nvidia_dynamic_lib.py", line 66, in child_process_func
    loaded_dl_fresh = load_nvidia_dynamic_lib(libname)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_nvidia_dynamic_lib.py", line 61, in load_nvidia_dynamic_lib
    return _load_lib_no_cache(libname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_nvidia_dynamic_lib.py", line 45, in _load_lib_no_cache
    return load_with_abs_path(libname, found.abs_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\cuda\pathfinder\_dynamic_libs\load_dl_windows.py", line 117, in load_with_abs_path
    raise RuntimeError(f"Failed to load DLL at {found_path}: {e}") from e
RuntimeError: Failed to load DLL at C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_runtime\bin\cudart32_110.dll: (193, 'LoadLibraryEx', '%1 is not a valid Win32 application.')
<end-of-stderr-from-child-process>

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 3, 2025

Assuming the last commit (bf48142) works as hoped, the Windows cudart 11.8.0 '%1 is not a valid Win32 application.' error may be the only error left.

@leofang
Copy link
Member

leofang commented Jul 4, 2025

This error happens usually because the exe or dll is not compatible with the platform, which is indeed the case based on the error log

RuntimeError: Failed to load DLL at C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_runtime\bin\cudart32_110.dll: (193, 'LoadLibraryEx', '%1 is not a valid Win32 application.')

We're trying to load the 32-bit cudart (cudart32_110.dll) on a 64-bit platform.

Another bug is: Both wheel and local CTK pipelines hit the same error message, meaning we also installed wheels to the local CTK pipelines. There is at lease one bug in the CI update that caused this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug No.1: we should remove 32-bit cudart DLLs from line 253-255.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: commit 135a37d

Comment on lines +274 to +275
# Run only after cuda.bindings tests, so that all wheels are installed already.
- name: Run cuda.pathfinder tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Bug" No.2: cuda.pathfinder tests probably should be run after cuda.bindings but before cuda.core, because cuda.core needs the cudart wheel as a test dependency (it's used by CuPy, not by cuda.core).

In the long run, we should really move cuda.pathfinder to its own test workflow. This kind of execution-order-dependent issues is not obvious.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I made the change locally, but then backed it out again: After removing the 32-bit DLLs (135a37d), I'd expect the tests to work, and it tests a little bit more.

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 5, 2025

/ok to test

Copy link

github-actions bot commented Jul 5, 2025

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 5, 2025

/ok to test

…HFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS is defined (to resolve `unbound variable` error).
@rwgk
Copy link
Collaborator Author

rwgk commented Jul 6, 2025

/ok to test

@rwgk
Copy link
Collaborator Author

rwgk commented Jul 6, 2025

/ok to test

…dows (to see if that resolves the cusolver failures)
@rwgk
Copy link
Collaborator Author

rwgk commented Jul 6, 2025

/ok to test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda.pathfinder Everything related to the cuda.pathfinder module enhancement Any code-related improvements P0 High priority - Must do!
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[FEA]: Move path_finder to cuda-python top level
4 participants