Make block, thread, and warp indices unsigned #110

Draft
wants to merge 1 commit into main

gmarkall (Collaborator) commented Jan 9, 2025

This ports numba/numba#6112 to numba-cuda, as outlined in #109.

Note that this patch does not change the type of `grid()` and `gridsize()`, because these need to be 64-bit (as discovered in numba/numba#9229 and fixed in numba/numba#9235).
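To illustrate why a 32-bit result would not be enough for `grid()`, here is a minimal pure-Python sketch that emulates the usual 1D global index computation (`blockIdx.x * blockDim.x + threadIdx.x`) with 32-bit and 64-bit unsigned wraparound. The helper names and the launch configuration are assumptions made up for this illustration; nothing here is taken from the patch itself.

```python
# Emulate fixed-width unsigned arithmetic by masking Python's unbounded ints.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def global_index(block_idx, block_dim, thread_idx, mask):
    """Return blockIdx * blockDim + threadIdx, wrapped to the given width."""
    return (block_idx * block_dim + thread_idx) & mask


# Hypothetical launch: 2**23 blocks of 1024 threads, i.e. 2**33 threads total.
block_dim = 1024
last_block = 2**23 - 1
last_thread = block_dim - 1

idx32 = global_index(last_block, block_dim, last_thread, MASK32)
idx64 = global_index(last_block, block_dim, last_thread, MASK64)

print(idx32)  # 4294967295 (wrapped: the true index does not fit in 32 bits)
print(idx64)  # 8589934591 (the true index, 2**33 - 1)
```

The individual block and thread indices in this launch all fit comfortably in 32 bits; only their combination into a global index overflows, which is consistent with keeping `grid()` and `gridsize()` at 64 bits while narrowing the per-dimension indices.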

We need to patch the `as_dtype()` function, which is a little unfortunate, but there is currently no API for extending its behaviour.

Current status: in progress, as items from the original PR need to be addressed (as per this comment):

  • Audit whether type widening still occurs.
  • Examine the cases of increased register usage, which are likely due to policies in the PTX -> SASS compiler rather than to anything changed in numba-cuda.

We still allow unsafe promotion of tid to 64 bits, because without it, comparisons with other 64-bit operands cannot occur.

gmarkall added the "2 - In Progress" label Jan 9, 2025