Feat: Adding Linear Algebra Dot operation support #116

Open: wants to merge 25 commits into base: main

SwayamInSync
Collaborator

This PR contributes the following:

  • Ships a dot method with the package that supports the following operations:
    • vector-vector dot product
    • matrix-vector dot product
    • matrix-matrix multiplication
  • Optimized linear algebra ops backed by QBLAS on x86-64 Linux and ARM machines. On Windows it falls back to a naive implementation because QBLAS is incompatible with MSVC.
  • A test suite to validate dot products between inputs

The images below show the performance comparison.

[Image: quadblas-x86-64-96-cores]

Machine: x86-64 Linux with 96 cores

[Image: quadblas-ARM-8-core]

Machine: macOS on Apple Silicon (ARM) with 8 cores

To compile without QBLAS, set DISABLE_QUADBLAS in both CFLAGS and CXXFLAGS.

@SwayamInSync SwayamInSync requested a review from ngoldbaum July 11, 2025 08:19
@SwayamInSync
Collaborator Author

Ah, I forgot to update the general CI. Or should we remove quaddtype from there, given that build_wheels.yml runs the same checks in its CI, more strictly and on all platforms?

@SwayamInSync SwayamInSync self-assigned this Jul 11, 2025
@juntyr
Contributor

juntyr commented Jul 11, 2025

Would the new functionality only be accessible through the dot function, or is there a way to have it called automatically when using NumPy's dot, matmul, etc.?

@ngoldbaum
Member

Unfortunately no, not easily. We'd need to add a new dtype hook to the DType API in NumPy. Worth doing though! See numpy/numpy#28516 which adds a hook for sorting.

@ngoldbaum
Member

(unless I'm missing an existing hook - @seberg might know better)

@seberg
Member

seberg commented Jul 11, 2025

I think the most interesting thing is actually @, which goes via matmul, which is a ufunc (like vecdot, vecmat, matvec). And adding ufuncs is straightforward, of course!

The old style ArrFuncs also has a PyArray_DotFunc, but it is relatively useless in practice, because it implements a vector dot. arr.dot(arr2) will use it but just end up being slow, I suspect.

(One fun thing is, as a "work-around" you could try to implement that PyArray_DotFunc and possibly raise an error to use the ufunc instead. -- although the function would be called many times.)

Adding new slots similar to the sort approach makes sense. Although, I think for arr.dot(arr2) it may be more relevant to see whether it makes sense to change it to call into matmul.
(Which may need some care, because I am not sure what the current paths are for objects, i.e. if those call back into OBJECT_dot, but I don't think so.)
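To illustrate the point about @ and ufuncs, here is a minimal check. It assumes a reasonably recent NumPy (vecdot appeared in NumPy 2.0 and matvec/vecmat in 2.2, so they are looked up defensively) and uses plain float64 arrays rather than the quad-precision dtype discussed in this PR.

```python
import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.arange(12, dtype=np.float64).reshape(3, 4)

# The @ operator on arrays goes through np.matmul, which is a ufunc object.
matmul_is_ufunc = isinstance(np.matmul, np.ufunc)
same_result = np.array_equal(a @ b, np.matmul(a, b))

# Related ufuncs; some are missing on older NumPy, hence getattr with a default.
related = {
    name: isinstance(getattr(np, name, None), np.ufunc)
    for name in ("vecdot", "matvec", "vecmat")
}

# np.dot, by contrast, is an ordinary function and not a ufunc.
dot_is_ufunc = isinstance(np.dot, np.ufunc)

print(matmul_is_ufunc, same_result, dot_is_ufunc, related)
```

Because matmul is a ufunc, a custom dtype that registers a loop for it would be picked up by the @ operator without a new slot, which is the motivation given above.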

@SwayamInSync
Collaborator Author

I see, so if we create a slot anyway, the implementation still needs to be done in quaddtype.
Now, with @seberg's comment, I remember trying NPY_DT_PyArray_ArrFuncs_dotfunc, but it also wasn't supported.

The current dot implementation (shipped with the package) handles vector-vector, gemv, or gemm depending on the input dimensions.

It would be good to have a slot parallel to np.dot (i.e., one that also picks the operation based on the input dimensions)
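The dimension-based dispatch described above can be sketched as follows. This is not the package's implementation: naive_dot is a hypothetical name, the loops stand in for the naive (non-QBLAS) path, and float64 inputs are used in place of quad-precision values.

```python
import numpy as np


def naive_dot(a, b):
    """Hypothetical sketch: choose the operation from the input dimensions."""
    a = np.asarray(a)
    b = np.asarray(b)
    kinds = (a.ndim, b.ndim)
    if kinds not in {(1, 1), (2, 1), (2, 2)}:
        raise ValueError(f"unsupported input dimensions: {kinds}")
    if a.shape[-1] != b.shape[0]:
        raise ValueError("inner dimensions do not match")

    if kinds == (1, 1):
        # vector-vector dot product
        return sum(x * y for x, y in zip(a, b))
    if kinds == (2, 1):
        # matrix-vector product (gemv-like): one dot product per row
        return np.array([naive_dot(row, b) for row in a])
    # matrix-matrix product (gemm-like): rows of a against columns of b
    return np.array([[naive_dot(row, col) for col in b.T] for row in a])


v = np.array([1.0, 2.0, 3.0])
m = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

vv = naive_dot(v, v)    # scalar
mv = naive_dot(m, v)    # shape (2,)
mm = naive_dot(m, m.T)  # shape (2, 2)
```

An optimized backend would replace the three branches with the corresponding BLAS-style routines while keeping the same dispatch on dimensions.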

@SwayamInSync
Collaborator Author

> Ah, I forgot to update the general CI. Or should we remove quaddtype from there, given that build_wheels.yml runs the same checks in its CI, more strictly and on all platforms?

@ngoldbaum thoughts on this?

@seberg
Member

seberg commented Jul 11, 2025

> It would be good to have a slot parallel to np.dot (i.e., one that also picks the operation based on the input dimensions)

Actually, I think we should probably not do this, unless arr.dot() really can't call into one of the existing ufuncs. I.e. yes, I am all for new slots, but if ufuncs work, that is better (even a new ufunc is likely better)!
(It does behave slightly differently, but I suspect one can map it.)

And the first step is to implement those ufuncs, I think.
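As a rough illustration of "behaves slightly differently, but one can map it": for 1-D and 2-D inputs np.dot and np.matmul agree, while for stacked (N-D) inputs np.dot forms all pairs of matrices and np.matmul broadcasts over the leading axes. The reshaping at the end is one possible mapping, shown as an assumption about how it could be done and not as what NumPy does internally. Plain float64 arrays are used.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
m = np.arange(9, dtype=np.float64).reshape(3, 3)

# 1-D and 2-D inputs: dot and matmul give the same results.
agree_vv = bool(np.dot(v, v) == np.matmul(v, v))
agree_mv = np.array_equal(np.dot(m, v), np.matmul(m, v))
agree_mm = np.array_equal(np.dot(m, m), np.matmul(m, m))

# Stacked inputs: the two functions differ.
a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
b = np.arange(40, dtype=np.float64).reshape(2, 4, 5)
dot_shape = np.dot(a, b).shape        # (2, 3, 2, 5): every a-row with every b-matrix
matmul_shape = np.matmul(a, b).shape  # (2, 3, 5): leading axis is broadcast

# One possible mapping of dot onto matmul: add broadcast axes so that
# matmul pairs every row vector of a with every matrix of b.
mapped = np.matmul(a[:, :, None, None, :], b[None, None, :, :, :])[..., 0, :]
maps_to_dot = np.allclose(mapped, np.dot(a, b))
```

np.dot also accepts scalar operands, which np.matmul rejects, so a full mapping would need to handle that case separately.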

4 participants