
Use cuda.bindings and cuda.core for Linker #133

Draft
wants to merge 19 commits into base: main
Conversation

brandon-b-miller
Collaborator

WIP
xref #129

@leofang
Member

leofang commented Feb 22, 2025

Thanks, @brandon-b-miller. Remember our goal is to drop every Linker subclass inside Numba in favor of cuda.core.Linker; the current PR is not what we want. Also note that to help phase out pynvjitlink, we already have rapidsai/pynvjitlink#111, which is essentially what this PR does today.

@brandon-b-miller brandon-b-miller changed the title Use cuda.bindings and cuda.core for nvjitlink Use cuda.bindings and cuda.core for Linker Feb 24, 2025

copy-pr-bot bot commented Feb 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


# Load
# cufunc = module.get_function(self._entry_name)
cufunc = cubin.get_kernel(self._entry_name)
Collaborator


Since we switched to the context-independent loading API, the CUDADispatcher.bind method should probably be renamed, since it no longer binds to a context by calling get_cufunc.

@gmarkall gmarkall added the 2 - In Progress Currently a work in progress label Mar 7, 2025
@brandon-b-miller
Collaborator Author

/ok to test

@brandon-b-miller
Collaborator Author

@gmarkall @leofang numba-cuda contains nvjitlink tests, should we maintain support for these as part of this PR or drop them in favor of testing in upstream cuda-python?

@gmarkall
Collaborator

> @gmarkall @leofang numba-cuda contains nvjitlink tests, should we maintain support for these as part of this PR or drop them in favor of testing in upstream cuda-python?

I think we need to maintain the tests that exercise Numba-CUDA's interaction with the linker, like the TestLinkerUsage class and the test_*_with_linkable_code tests. I don't think we need to keep the tests that purely exercise the PyNvJitLinker API, like the ones that pass different flags to it.

@gmarkall
Collaborator

Also, I think we can probably delete the PyNvJitLinker class in this PR as well - is there any reason to keep it around?

I'm comfortable with:

  • Using cuda.core.Linker when the user asks for pynvjitlink or the NVIDIA bindings, and
  • Using the ctypes linker otherwise

which is what this PR seems to offer. (correct me if I've read it wrong 🙂)

@brandon-b-miller
Collaborator Author

> Also, I think we can probably delete the PyNvJitLinker class in this PR as well - is there any reason to keep it around?
>
> I'm comfortable with:
>
>   • Using cuda.core.Linker when the user asks for pynvjitlink or the NVIDIA bindings, and
>   • Using the ctypes linker otherwise
>
> which is what this PR seems to offer. (correct me if I've read it wrong 🙂)

Correct, this is the outcome I am aiming for.

@brandon-b-miller
Collaborator Author

@gmarkall on second thought, we might need to leave the MVCLinker in, in some capacity, as long as we're supporting CUDA 11. I don't think that cuda-python supports the functionality that cubinlinker enables.

@gmarkall
Collaborator

@brandon-b-miller Sorry, yes - I had that in mind but didn't write it down.

@brandon-b-miller
Collaborator Author

> @brandon-b-miller Sorry, yes - I had that in mind but didn't write it down.

Ok, just to have it written down somewhere, after this PR we will:

For CUDA 11, maintain the current way of configuring which bindings to use:

  • Default ctypes bindings, optional cuda-python bindings with NUMBA_CUDA_USE_NVIDIA_BINDING=1, optional MVCLinker with NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY=1.

For CUDA 12, we will have:

  • Default ctypes bindings, optional cuda-python bindings with NUMBA_CUDA_USE_NVIDIA_BINDING=1
  • Use of pynvjitlink through cuda-python if NUMBA_CUDA_ENABLE_PYNVJITLINK=1
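As shell exports, the configurations above would look roughly like this (a sketch based on the variable names given in this thread; consult the numba-cuda docs for exact semantics):

```shell
# CUDA 11: NVIDIA bindings, plus the MVCLinker for minor version compatibility
export NUMBA_CUDA_USE_NVIDIA_BINDING=1
export NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY=1

# CUDA 12: route linking through pynvjitlink via cuda.core
export NUMBA_CUDA_ENABLE_PYNVJITLINK=1
```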

This will leave us with 3 linkers:

  • The ctypes linker, which is used by default regardless of CUDA version
  • The MVC linker, which is used in a CUDA 11 environment when MVC is required, regardless of which binding is in use
  • The new linker, which is used in a CUDA 12 environment when either the cuda-python bindings or pynvjitlink is enabled
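The selection logic described above could be sketched as a pure function; the returned class names ("MVCLinker", "CudaPythonLinker", "CtypesLinker") are illustrative placeholders, not necessarily the names used in this PR:

```python
# Illustrative sketch of the three-linker selection plan; not the actual
# numba-cuda implementation.
def select_linker(cuda_major, use_nvidia_binding=False,
                  enable_mvc=False, enable_pynvjitlink=False):
    """Return the linker to use for a given CUDA version and config flags."""
    if cuda_major == 11 and enable_mvc:
        # CUDA 11 + minor version compatibility: the cubinlinker-based linker
        return "MVCLinker"
    if cuda_major >= 12 and (use_nvidia_binding or enable_pynvjitlink):
        # CUDA 12 with the NVIDIA bindings or pynvjitlink: the cuda.core linker
        return "CudaPythonLinker"
    # Default: the ctypes-based linker, regardless of CUDA version
    return "CtypesLinker"
```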

@gmarkall
Collaborator

Thanks for the summary! To look a little further ahead, we want to end up with only one linker, which is the new linker. This would be achieved by deprecating / removing the other linkers as soon as appropriate:

  • MVCLinker can be removed as soon as CUDA 11 support is dropped.
  • The ctypes linker can be deprecated and removed once we can take a hard dependency on cuda.core and have tested the new linker in use for a bit to shake out any issues.

Are you in alignment with the above plan @brandon-b-miller ?

@brandon-b-miller
Collaborator Author

> Thanks for the summary! To look a little further ahead, we want to end up with only one linker, which is the new linker. This would be achieved by deprecating / removing the other linkers as soon as appropriate:
>
>   • MVCLinker can be removed as soon as CUDA 11 support is dropped.
>   • The ctypes linker can be deprecated and removed once we can take a hard dependency on cuda.core and have tested the new linker in use for a bit to shake out any issues.
>
> Are you in alignment with the above plan @brandon-b-miller ?

Yup, this sounds good to me.

self._object_codes.append(obj)


def add_library(self, lib, name='<cudapy-lib>'):
Collaborator Author


There are a few cases in this class where I'm finding I have to do something like this. We discussed wanting to avoid having an ObjectCode constructor for every possible type of input, but it seems like Numba requires the ability to assemble the eventual inputs to the link from a pretty broad variety of sources.

Collaborator


It seems like we need an API to construct ObjectCode instances from inputs of arbitrary kinds - archives, cubins, LTOIR, fatbins, etc. It seems like the existing interface is aimed at providing something relatively safe, but doesn't allow for the lower-level, fine-grained control we need here to set up and complete a link.

Is that understanding correct?
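For concreteness, the kind of input classification being discussed might look like the following hypothetical helper. The suffix-to-kind table is purely illustrative; it is not a cuda.core API, and cuda.core may expose an entirely different mechanism:

```python
import os

# Hypothetical mapping from file suffix to the kind of code a linker input
# contains; illustrative only.
_SUFFIX_TO_KIND = {
    ".cubin": "cubin",
    ".ptx": "ptx",
    ".ltoir": "ltoir",
    ".fatbin": "fatbin",
    ".o": "object",
    ".a": "library",  # static archive
}

def classify_link_input(path):
    """Return the code kind for a linker input, or raise for unknown suffixes."""
    _, ext = os.path.splitext(path)
    try:
        return _SUFFIX_TO_KIND[ext]
    except KeyError:
        raise ValueError(f"don't know how to link {path!r}") from None
```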

Collaborator Author


This is what I've arrived at doing. @leofang and I have had a few conversations about how this capability would fit into the broader goals of the cuda-python API; I'd be curious to hear his thoughts.

@gmarkall
Collaborator

/ok to test
