
[QUESTION] Array of sparse matrix? #366

Open · Qiuany opened this issue Nov 24, 2024 · 3 comments
Labels: question (The issue author requires information)

Qiuany commented Nov 24, 2024

Hello,

I’m wondering if Warp supports arrays of sparse matrices, particularly for parallelized sparse matrix multiplication. If not, are there any plans to add this feature, or any suggestions for implementing it?

Thank you for your help!

mmacklin (Collaborator) commented:

Hi @Qiuany , do I understand correctly that you would want batched sparse SGEMM for example?

Would problem sizes be constant through the batch?

While we don't have anything for this specifically, I think you could roll your own fairly easily: for a naive implementation you can assign one thread per output and simply write out the CSR multiplication. For a more efficient implementation, please look out for the tile primitives coming in the 1.5.0 release, which would allow you to use e.g. one block of threads per row.

Thanks,
Miles
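
For illustration, here is a minimal sketch of the naive one-thread-per-output approach described above, written as a batched CSR matrix-vector product (a sparse-times-dense matrix product follows the same pattern with one extra output dimension). The padded 2D layout and all names here are illustrative assumptions, not part of Warp's API, and assume every matrix in the batch has the same shape:

```python
import warp as wp

wp.init()

@wp.kernel
def batched_csr_spmv(
    row_offsets: wp.array2d(dtype=wp.int32),  # (batch, n_rows + 1)
    col_indices: wp.array2d(dtype=wp.int32),  # (batch, nnz_padded)
    values: wp.array2d(dtype=wp.float32),     # (batch, nnz_padded)
    x: wp.array2d(dtype=wp.float32),          # (batch, n_cols)
    y: wp.array2d(dtype=wp.float32),          # (batch, n_rows)
):
    b, i = wp.tid()  # one thread per (matrix, output row)
    s = float(0.0)
    for k in range(row_offsets[b, i], row_offsets[b, i + 1]):
        s += values[b, k] * x[b, col_indices[b, k]]
    y[b, i] = s

# launched with one thread per output entry across the whole batch, e.g.:
# wp.launch(batched_csr_spmv, dim=(batch_count, n_rows),
#           inputs=[row_offsets, col_indices, values, x, y])
```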


Qiuany commented Nov 25, 2024

Thank you for your quick reply.

I would like to handle sparse matrices in a similar way to passing wp.array(dtype=wp.mat33d) into a kernel and performing matrix operations on one matrix per thread. All matrices involved in the computation are of the same size. It seems that Warp currently supports arrays of scalars, vectors, matrices, and user-implemented structs, but does not directly support sparse matrices. Does this mean I need to implement a CSR matrix struct and the corresponding computation functions in Warp myself?

I am not very familiar with the details of CSR-related computations. Since Warp kernels do not allow the creation of arrays dynamically, does this imply that the rows, columns, and values in the CSR struct must be stored as vectors? Additionally, because the sparsity of matrices is not predetermined, it seems the lengths of these vectors can only be determined at runtime. Is my understanding correct?

Thanks,
Qiuan
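
For reference, one common way to deal with the runtime-determined lengths mentioned above is to concatenate every matrix's CSR data into flat Warp arrays on the host and index them with per-matrix offsets inside the kernel. This is only an illustrative layout, assuming all matrices share the same number of rows; it is not something Warp provides out of the box:

```python
import numpy as np
import warp as wp

wp.init()

n_rows = 3  # all matrices share the same shape in this sketch

# two small CSR matrices with different nnz, assembled on the host at runtime
mats = [
    # (row_offsets, col_indices, values)
    (np.array([0, 1, 2, 3]), np.array([0, 1, 2]), np.array([1.0, 2.0, 3.0])),
    (np.array([0, 2, 3, 5]), np.array([0, 2, 1, 0, 2]), np.array([4.0, 5.0, 6.0, 7.0, 8.0])),
]

# concatenate everything into flat device arrays and record where each matrix starts
row_offsets = wp.array(np.concatenate([r for r, _, _ in mats]), dtype=wp.int32)
col_indices = wp.array(np.concatenate([c for _, c, _ in mats]), dtype=wp.int32)
values = wp.array(np.concatenate([v for _, _, v in mats]), dtype=wp.float32)
nnz_offsets = wp.array(np.cumsum([0] + [len(v) for _, _, v in mats]), dtype=wp.int32)

# inside a kernel, matrix b's nonzeros live in values[nnz_offsets[b] : nnz_offsets[b + 1]],
# and its row pointers (relative to nnz_offsets[b]) live in
# row_offsets[b * (n_rows + 1) : (b + 1) * (n_rows + 1)]
```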

mmacklin (Collaborator) commented:

Warp does actually have support for BSR (block-sparse, which is a superset of CSR) matrices:

https://nvidia.github.io/warp/modules/sparse.html

But yes, these types of sparsity formats are designed for 'runtime' sparsity, typically with one large matrix at a time, not many small sparse matrices.

What kind of dimensions would you want for your case? And what kind of sparsity pattern / nnz would you expect? All these things would have quite a big impact on how to best implement sparse matrix support.

In general, sparsity probably makes the most sense when you can leverage sparse TensorCore instructions, which are available for certain datatypes / sparsity patterns (https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/). It's not something we're exposing yet, but I would be interested in more details to understand what is required.

Cheers,
Miles
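
For completeness, here is a minimal sketch of using the warp.sparse module linked above for a single runtime-sparsity matrix. The calls follow the documented bsr_from_triplets / bsr_mv / bsr_mm interface, but the exact signatures should be checked against the documentation for your Warp version:

```python
import numpy as np
import warp as wp
import warp.sparse as sparse

wp.init()

# build a 3x3 CSR matrix (BSR with scalar 1x1 blocks) from COO triplets
rows = wp.array(np.array([0, 0, 1, 2], dtype=np.int32))
cols = wp.array(np.array([0, 2, 1, 2], dtype=np.int32))
vals = wp.array(np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32))

A = sparse.bsr_from_triplets(3, 3, rows, cols, vals)

x = wp.array(np.ones(3, dtype=np.float32))
y = sparse.bsr_mv(A, x)   # sparse matrix-vector product
C = sparse.bsr_mm(A, A)   # sparse matrix-matrix product
```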
