[FEA] Support automatic generation of `link` list for external code #67

gmarkall · 2024-11-06T21:51:01Z

Numba-cuda extensions (e.g. nvmath-python) frequently rely on CUDA C++ implementations for the core of their functionality.

One current UX limitation is that the kernel author is required to add the list of files and/or code to link with a kernel as a keyword argument to the @cuda.jit decorator, for example:

@cuda.jit(link=FFT.files)
def f(data):
    ...

(This example is from cufftdx_simple_fft_block.py.)

The FFT object supplies the files, and is created like:

FFT = fft(fft_type='c2c', 
          size=128,
          precision=np.float32,
          direction='forward',
          elements_per_thread=8,
          ffts_per_block=2,
          execution='Block',
          compiler='numba')

and is called inside the kernel as:

FFT(thread_data, shared_mem)

Rather than requiring the user to link FFT.files, Numba should provide a mechanism to obtain and link the list of files / code (LTO-IR, PTX, CUDA C/C++ source, binaries / objects, etc.) from the FFT object (or from any implementation of a method, property, object, etc. backed by an extension) at the point of compilation and linking. The implementation (of FFT, in this example) is expected to be able to generate code (e.g. LTO-IR) at this point, just before returning it to Numba.


leofang · 2024-11-08T15:58:38Z

Perhaps ask all device functions to implement a ~~method~~ attribute, say, __numba_cuda_link__ that returns a list of files, if they want numba-cuda to handle the linking?
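To make the attribute idea concrete, here is a minimal pure-Python sketch of how a compiler could gather link items from the objects called in a kernel. It does not use numba-cuda at all: `FakeFFT`, `plain_helper`, `collect_link_items` and the file names are hypothetical stand-ins, and only the `__numba_cuda_link__` name comes from the suggestion above.

```python
# Hypothetical sketch: none of these helpers exist in numba-cuda.

class FakeFFT:
    """Stand-in for an extension-provided device function object."""

    # Attribute form of the proposed protocol: a list of items to link.
    __numba_cuda_link__ = ["fft_impl.ltoir"]

    def __call__(self, thread_data, shared_mem):
        raise NotImplementedError("only meaningful inside a kernel")


def plain_helper(x):
    """A callee without the attribute contributes nothing to the link list."""
    return x


def collect_link_items(callees, user_link=()):
    """Merge an explicit link list with items advertised by the callees."""
    items = list(user_link)
    for callee in callees:
        for item in getattr(callee, "__numba_cuda_link__", ()):
            if item not in items:  # keep first-seen order, drop duplicates
                items.append(item)
    return items


link = collect_link_items([FakeFFT(), plain_helper], user_link=["extra.ptx"])
print(link)  # ['extra.ptx', 'fft_impl.ltoir']
```

With something along these lines, an explicit `link=` argument could remain available for extra files while items advertised by callees are added automatically.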

gmarkall · 2024-11-08T16:03:32Z

Continuing with the __numba_cuda_link__ idea, I think it might need to be a method that can accept a signature, so that it can return the appropriate files for the given signature.

leofang · 2024-11-08T16:07:38Z

How should the kernel author pass function arguments at the call site if it is a method rather than an attribute?

gmarkall · 2024-11-08T18:07:01Z

In the example above, the kernel author wrote:

FFT(thread_data, shared_mem)

Assuming the Numba types of these are float32[:] and float32[::1] (for the sake of argument; they could be any Numba type really), I'd expect that at compilation time Numba would do the equivalent of calling:

ltoir = FFT.__numba_cuda_link__(float32[:], float32[::1])

where ltoir is then an LTOIR linkable code object, i.e. an instance of

numba-cuda/numba_cuda/numba/cuda/cudadrv/linkable_code.py, lines 59 to 63 in e5f2ef9:

class LTOIR(LinkableCode):
    """An LTOIR file in memory"""

    kind = "ltoir"
    default_name = "<unnamed-ltoir>"
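As a rough illustration of the method form, the sketch below shows an object that generates code lazily per signature and caches the result. It is pure Python and makes several assumptions: `FakeLTOIR` only mimics the shape of a linkable-code object, `FakeFFT` and its `_generate` step are hypothetical, and argument types are passed as plain strings rather than real Numba types.

```python
# Hypothetical sketch: FakeLTOIR and FakeFFT are stand-ins, not numba-cuda classes.

class FakeLTOIR:
    """Stand-in for a linkable-code object holding LTO-IR bytes in memory."""

    kind = "ltoir"

    def __init__(self, data, name="<unnamed-ltoir>"):
        self.data = data
        self.name = name


class FakeFFT:
    def __init__(self):
        self._cache = {}
        self.generated = 0  # counts how often code generation actually ran

    def _generate(self, argtypes):
        # A real extension would invoke its own code generator here.
        self.generated += 1
        signature = ", ".join(argtypes)
        return FakeLTOIR(("ltoir for " + signature).encode(),
                         name="fft<" + signature + ">")

    def __numba_cuda_link__(self, *argtypes):
        # Generate on first request for a signature, then reuse the result.
        if argtypes not in self._cache:
            self._cache[argtypes] = self._generate(argtypes)
        return self._cache[argtypes]


fft = FakeFFT()
a = fft.__numba_cuda_link__("float32[:]", "float32[::1]")
b = fft.__numba_cuda_link__("float32[:]", "float32[::1]")
c = fft.__numba_cuda_link__("float64[:]", "float64[::1]")
print(a is b, a is c, fft.generated)  # True False 2
```

The kernel author never calls this method directly; in this sketch it is the compiler that would call it with the argument types it inferred at the call site.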

gmarkall added the "feature request" label on Nov 6, 2024
