using syrk for performing special cases of matrix multiplication #2509

Merged
merged 17 commits into from
Jul 12, 2025

Conversation

vtavana
Collaborator

@vtavana vtavana commented Jul 1, 2025

In this PR, the `syrk` routine from oneMKL is used to perform a rank-k update, which handles the special case of matrix multiplication where the result is a symmetric matrix, such as `a @ a.mT`.
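For readers unfamiliar with the BLAS operation, a rank-k update computes `C <- alpha * A @ A^T + beta * C` and only needs to fill one triangle of `C`, because the result is symmetric. The following is a minimal plain-Python sketch of that idea, not the oneMKL routine or the dpnp implementation. The function names `syrk_upper`, `mirror_upper`, and `matmul_full` are hypothetical, and the mirroring step is an assumption about how a full symmetric result could be recovered; the PR description does not spell that detail out.

```python
def syrk_upper(a, alpha=1.0, beta=0.0, c=None):
    """Illustrative rank-k update: C <- alpha * A @ A^T + beta * C (upper triangle only)."""
    n, k = len(a), len(a[0])
    if c is None:
        c = [[0.0] * n for _ in range(n)]
    out = [row[:] for row in c]
    for i in range(n):
        for j in range(i, n):  # only j >= i: about half the dot products of a full product
            dot = sum(a[i][p] * a[j][p] for p in range(k))
            out[i][j] = alpha * dot + beta * c[i][j]
    return out


def mirror_upper(c):
    """Copy the upper triangle into the lower one to get the full symmetric matrix."""
    n = len(c)
    return [[c[i][j] if j >= i else c[j][i] for j in range(n)] for i in range(n)]


def matmul_full(a, b):
    """Plain full product a @ b, used here only as a reference."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]                 # shape (2, 3)
a_t = [list(col) for col in zip(*a)]  # transpose, shape (3, 2)

sym = mirror_upper(syrk_upper(a))  # triangle-only computation, then mirrored
full = matmul_full(a, a_t)         # ordinary full matrix product
print(sym == full)
```

The triangle-only loop evaluates `n * (n + 1) / 2` dot products instead of `n * n`, which is the kind of saving a `syrk`-based path can exploit over a general `gemm` call.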

$ sycl-ls
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Data Center GPU Max 1100 12.60.7 [1.6.31294+9]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Platinum 8480+ OpenCL 3.0 (Build 0) [2025.19.1.0.16_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Data Center GPU Max 1100 OpenCL 3.0 NEO  [24.39.31294]
| size | syrk-GPU | gemm-GPU | syrk-CPU | gemm-CPU |
|------|----------|----------|----------|----------|
| 2048 | 1.37 ms ± 2.81 μs | 2.02 ms ± 3.69 μs | 17.5 ms ± 1.21 ms | 12 ms ± 365 μs |
| 4096 | 7.94 ms ± 16.6 μs | 14.2 ms ± 55.9 μs | 73.2 ms ± 4.94 ms | 68.2 ms ± 5.93 ms |
| 8192 | 59 ms ± 144 μs | 107 ms ± 126 μs | 481 ms ± 37.1 ms | 674 ms ± 56.5 ms |
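To make the comparison easier to read, the ratio of `gemm` time to `syrk` time can be computed from the mean timings above. This is a small sketch that uses only the reported means and ignores the ± spread, so small differences (for example the CPU results at size 4096) may be within measurement noise. The names `timings_ms` and `speedup` are introduced here for illustration.

```python
# Mean timings in milliseconds, copied from the table above (the ± spread is ignored).
timings_ms = {
    2048: {"syrk-GPU": 1.37, "gemm-GPU": 2.02, "syrk-CPU": 17.5, "gemm-CPU": 12.0},
    4096: {"syrk-GPU": 7.94, "gemm-GPU": 14.2, "syrk-CPU": 73.2, "gemm-CPU": 68.2},
    8192: {"syrk-GPU": 59.0, "gemm-GPU": 107.0, "syrk-CPU": 481.0, "gemm-CPU": 674.0},
}


def speedup(row, device):
    """gemm time divided by syrk time; values above 1 mean syrk is faster."""
    return row[f"gemm-{device}"] / row[f"syrk-{device}"]


speedups = {
    size: {dev: round(speedup(row, dev), 2) for dev in ("GPU", "CPU")}
    for size, row in timings_ms.items()
}

for size, s in speedups.items():
    print(f"{size}: GPU x{s['GPU']:.2f}, CPU x{s['CPU']:.2f}")
```

On these numbers, `syrk` is roughly 1.5x to 1.8x faster than `gemm` on the GPU at every size, while on the CPU it is slower at 2048 and 4096 and about 1.4x faster at 8192.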

Time measurement using:

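import dpnp

size = 4096  # the table reports sizes 2048, 4096, and 8192
# (size, 2*size) input, so ia @ ia.mT is a symmetric (size, size) result
ia = dpnp.ones((size, 2*size), device="cpu")  # "cpu" here; the GPU columns presumably use device="gpu"
# IPython magic; wait on the queue so the timing includes kernel completion
%timeit r = dpnp.matmul(ia, ia.mT); r.sycl_queue.wait()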

Also see timing measured here.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

Contributor

github-actions bot commented Jul 1, 2025

View rendered docs @ https://intelpython.github.io/dpnp/index.html

Contributor

github-actions bot commented Jul 1, 2025

Array API standard conformance tests for dpnp=0.19.0dev1=py313h509198e_24 ran successfully.
Passed: 1227
Failed: 0
Skipped: 9

Collaborator

coveralls commented Jul 2, 2025

Coverage Status

coverage: 72.044% (-0.08%) from 72.128%
when pulling a45a747 on syrk
into 1a7ce22 on master.

@vtavana vtavana marked this pull request as ready for review July 10, 2025 00:02
Contributor

@antonwolfy antonwolfy left a comment

Thank you @vtavana, it looks like a great improvement. No more review comments from me.

@vtavana vtavana merged commit afd5c6d into master Jul 12, 2025
91 of 98 checks passed
@vtavana vtavana deleted the syrk branch July 12, 2025 17:26
github-actions bot added a commit that referenced this pull request Jul 12, 2025
…2509)

In this PR, the `syrk` routine from oneMKL is used to perform a rank-k
update, which is used for a specialized matrix multiplication where the
result is a symmetric matrix.

afd5c6d
3 participants