[FEA]: cccl.c and cuda.parallel should support indirect_iterator_t which can be advanced on both host and device to support streaming algorithms #4148

Open
1 task done
oleksandr-pavlyk opened this issue Mar 14, 2025 · 0 comments
Assignees
Labels
feature request New feature or request.

oleksandr-pavlyk commented Mar 14, 2025

Is this a duplicate?

Area

cuda.parallel (Python)

Is your feature request related to a problem? Please describe.

To attain optimal performance, kernels for some algorithms must use 32-bit types to store problem-size arguments.

Supporting these algorithms for problem sizes in excess of INT_MAX can be done with a streaming approach, with the streaming logic encoded in the algorithm's dispatcher. The dispatcher needs to increment iterators on the host.

This is presently not supported by cccl.c.parallel, since indirect_arg_t does not implement an increment operator.
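To make the streaming idea concrete, here is a minimal host-only sketch in plain C++. All names in it (`host_counting_iterator`, `chunk_record`, `launch_chunk`, `dispatch_streaming`) are hypothetical and are not part of cccl.c.parallel; `launch_chunk` only records what a real kernel launch would receive. It shows why the dispatcher needs `operator+=` on the host: each chunk is launched with a 32-bit size, and the iterator is advanced between launches.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstdint>
#include <vector>

// Hypothetical host-side iterator; the dispatcher only needs operator+=.
struct host_counting_iterator {
  std::uint64_t value;
  host_counting_iterator& operator+=(std::uint64_t n) {
    value += n;
    return *this;
  }
};

// Records the arguments one kernel launch would have received.
struct chunk_record {
  std::uint64_t first;  // position of the iterator at launch time
  std::int32_t count;   // 32-bit problem size passed to the kernel
};

// Stand-in for a kernel launch that accepts a 32-bit problem size.
inline void launch_chunk(host_counting_iterator it, std::int32_t n,
                         std::vector<chunk_record>& chunks) {
  chunks.push_back({it.value, n});
}

// Streaming dispatcher sketch: splits a 64-bit problem into chunks of at
// most max_chunk items and advances the iterator on the host in between.
inline std::vector<chunk_record> dispatch_streaming(
    host_counting_iterator first, std::uint64_t num_items,
    std::int32_t max_chunk = INT_MAX) {
  assert(max_chunk > 0);
  std::vector<chunk_record> chunks;
  std::uint64_t remaining = num_items;
  while (remaining > 0) {
    const auto n = static_cast<std::int32_t>(std::min<std::uint64_t>(
        remaining, static_cast<std::uint64_t>(max_chunk)));
    launch_chunk(first, n, chunks);
    first += static_cast<std::uint64_t>(n);  // host-side increment
    remaining -= static_cast<std::uint64_t>(n);
  }
  return chunks;
}
```

With `num_items` of 5,000,000,000 and the default chunk size, this loop issues three launches, the last one covering the remainder.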

Since indirect_arg_t is used to represent cccl_value_t, cccl_operation_t and cccl_iterator_t, and incrementing only makes sense for iterators, a dedicated type, indirect_iterator_t, must be introduced, which may implement operator+=.

If the entirety of the iterator state is user-defined, cuda.parallel must provide a host function pointer that increments the iterator's state, obtained by compiling the advance function for the host.
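A rough sketch of this first option, in plain C++: the wrapper type holds a copy of the opaque state bytes plus a host function pointer, and `operator+=` forwards to that pointer. The names `indirect_iterator_sketch`, `host_advance_fn_t`, `strided_state` and `advance_strided` are illustrative assumptions, not the actual cccl.c.parallel types or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical signature of a host-compiled advance function: it mutates
// the opaque iterator state in place by `offset` positions.
using host_advance_fn_t = void (*)(void* state, std::uint64_t offset);

// Sketch of an indirect iterator whose state is entirely user-defined.
// It owns a copy of the state bytes and a host pointer to the advance function.
struct indirect_iterator_sketch {
  std::vector<unsigned char> state;
  host_advance_fn_t host_advance;

  indirect_iterator_sketch(const void* src, std::size_t size,
                           host_advance_fn_t fn)
      : state(size), host_advance(fn) {
    assert(src != nullptr && size > 0 && fn != nullptr);
    std::memcpy(state.data(), src, size);
  }

  // Host-side increment: delegates to the user-provided advance function.
  indirect_iterator_sketch& operator+=(std::uint64_t offset) {
    host_advance(state.data(), offset);
    return *this;
  }
};

// Example user-defined state and the advance function compiled for the host.
struct strided_state {
  std::int64_t current;
  std::int64_t stride;
};

inline void advance_strided(void* state, std::uint64_t offset) {
  strided_state s;
  std::memcpy(&s, state, sizeof(s));  // read the opaque bytes
  s.current += s.stride * static_cast<std::int64_t>(offset);
  std::memcpy(state, &s, sizeof(s));  // write the updated state back
}
```

The cost of this design is that every user-defined iterator needs its advance function compiled twice, once for the device and once for the host.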

If we define the state as a struct that contains a size_t linear_id in addition to the user-defined state, we could get rid of the user-defined advance function altogether, but would need to give the dereference function access to linear_id.

Both approaches need to be prototyped and compared.
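The second option might look roughly like the following plain C++ sketch. The layout `iterator_state_with_id`, the dereference signature `dereference_fn_t`, and the example `affine_state` are assumptions made for illustration. In real code these functions would also need to be callable on the device, which is omitted here so the sketch compiles with an ordinary host compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: a library-owned linear_id next to the user-defined state.
template <class UserState>
struct iterator_state_with_id {
  std::size_t linear_id;
  UserState user;
};

// Hypothetical dereference signature: it receives linear_id together with
// the user state, so no user-defined advance function is required.
template <class UserState, class Value>
using dereference_fn_t = Value (*)(const UserState& user,
                                   std::size_t linear_id);

template <class UserState, class Value>
struct linear_id_iterator {
  iterator_state_with_id<UserState> state;
  dereference_fn_t<UserState, Value> deref;

  // Advancing only touches linear_id, so it works the same on host and device.
  linear_id_iterator& operator+=(std::size_t n) {
    state.linear_id += n;
    return *this;
  }

  Value operator*() const { return deref(state.user, state.linear_id); }
};

// Example user state: the value at position id is base + stride * id.
struct affine_state {
  std::int64_t base;
  std::int64_t stride;
};

inline std::int64_t deref_affine(const affine_state& s, std::size_t id) {
  return s.base + s.stride * static_cast<std::int64_t>(id);
}
```

Here the user state never changes after construction; the trade-off is that dereference must compute the element from `linear_id` on each access instead of reading pre-advanced state.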

Describe the solution you'd like

The solution should unblock #3764

Additional context

#3764 (comment)

Projects
Status: Todo