AWS GPU Configuration #868

Draft pull request: wants to merge 2 commits into base: main

hameerabbasi (Collaborator) commented May 5, 2025

This PR adds support for the following features (COO format only):

  • Elementwise operations (if supported by CuPy)
  • Sum along an axis
  • Matmul
  • Converting to/from cupyx.scipy.sparse matrices
  • to_device with "cpu" and CuPy devices (only stream=None)
  • Constructing from CuPy arrays
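
For readers less familiar with the COO (coordinate) format, here is a minimal pure-Python sketch of what the three array operations in the list mean for a 2-D COO array. It is not the PR's implementation, which runs on CuPy. The helper names coo_add, coo_sum and coo_matmul and the (shape, {coords: value}) representation are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical representation: a 2-D COO array is (shape, {(row, col): value}).
# Only nonzero entries are stored. This is an illustration, not the PR's API.


def coo_add(a, b):
    """Elementwise add: merge coordinates, then drop entries that became zero."""
    shape_a, data_a = a
    shape_b, data_b = b
    assert shape_a == shape_b, "shapes must match (no broadcasting in this sketch)"
    out = defaultdict(float)
    for coords, value in list(data_a.items()) + list(data_b.items()):
        out[coords] += value
    return shape_a, {c: v for c, v in out.items() if v != 0}


def coo_sum(a, axis):
    """Sum a 2-D COO array along one axis; returns a dense Python list."""
    shape, data = a
    out = [0.0] * shape[1 - axis]
    for (i, j), value in data.items():
        out[j if axis == 0 else i] += value
    return out


def coo_matmul(a, b):
    """Matrix product of two 2-D COO arrays, visiting stored entries only."""
    (m, k), data_a = a
    (k2, n), data_b = b
    assert k == k2, "inner dimensions must match"
    # Group the entries of b by row so each entry of a finds its partners.
    rows_b = defaultdict(list)
    for (i, j), value in data_b.items():
        rows_b[i].append((j, value))
    out = defaultdict(float)
    for (i, p), va in data_a.items():
        for j, vb in rows_b[p]:
            out[(i, j)] += va * vb
    return (m, n), {c: v for c, v in out.items() if v != 0}


x = ((2, 3), {(0, 0): 1.0, (1, 2): 2.0})
y = ((2, 3), {(0, 0): 3.0, (0, 1): 4.0})
z = ((3, 2), {(0, 1): 5.0, (2, 0): 6.0})

print(coo_add(x, y))
print(coo_sum(x, axis=0))
print(coo_matmul(x, z))
```

Each helper touches only the stored entries, which is the property that makes COO attractive for sparse data on either CPU or GPU.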

hameerabbasi force-pushed the aws-gpu branch 2 times, most recently from b99e7a5 to 185c956 (May 5, 2025 08:14)

codspeed-hq bot commented May 5, 2025

CodSpeed Performance Report

Merging #868 will degrade performance by 97.81%

Comparing aws-gpu (55ae3bc) with main (afb5212)

Summary

⚡ 10 improvements
❌ 151 regressions
✅ 179 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark                                            BASE    HEAD    Change
test_elemwise[side=100-rank=1-format='coo'-add]      2.9 ms  3.7 ms  -21.3%
test_elemwise[side=100-rank=1-format='coo'-mul]      2.2 ms  2.7 ms  -20.45%
test_elemwise[side=100-rank=1-format='gcxs'-add]     3.4 ms  4.6 ms  -26.65%
test_elemwise[side=100-rank=1-format='gcxs'-mul]     2.7 ms  3.7 ms  -27.45%
test_elemwise[side=100-rank=2-format='coo'-add]      3.3 ms  4.1 ms  -19.78%
test_elemwise[side=100-rank=2-format='coo'-mul]      2.4 ms  2.9 ms  -17.56%
test_elemwise[side=100-rank=2-format='gcxs'-add]     6.7 ms  7.7 ms  -13.02%
test_elemwise[side=100-rank=2-format='gcxs'-mul]     5.8 ms  6.5 ms  -10.95%
test_elemwise[side=1000-rank=1-format='coo'-add]     2.9 ms  3.8 ms  -22.4%
test_elemwise[side=1000-rank=1-format='coo'-mul]     2.2 ms  2.7 ms  -20.34%
test_elemwise[side=1000-rank=1-format='gcxs'-add]    3.4 ms  4.7 ms  -27.39%
test_elemwise[side=1000-rank=1-format='gcxs'-mul]    2.7 ms  3.7 ms  -27.25%
test_elemwise[side=500-rank=1-format='coo'-add]      2.9 ms  3.8 ms  -22.45%
test_elemwise[side=500-rank=1-format='coo'-mul]      2.2 ms  2.7 ms  -20.39%
test_elemwise[side=500-rank=1-format='gcxs'-add]     3.4 ms  4.7 ms  -27.43%
test_elemwise[side=500-rank=1-format='gcxs'-mul]     2.7 ms  3.7 ms  -27.29%
test_elemwise[side=500-rank=2-format='coo'-add]      7.1 ms  8 ms    -10.33%
test_elemwise[side=500-rank=2-format='coo'-mul]      3.9 ms  4.4 ms  -11.76%
test_elemwise_broadcast[side=100-format='coo'-mul]   2.6 ms  3.2 ms  -18.38%
test_elemwise_broadcast[side=100-format='gcxs'-mul]  6.4 ms  7.4 ms  -13.71%
...                                                  ...     ...     ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
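
The Change column appears consistent with BASE / HEAD - 1, so a negative value means HEAD is slower. That formula is an assumption inferred from the displayed numbers, not something the report states. Because BASE and HEAD are shown rounded to one decimal, recomputed values can differ from the reported ones by roughly a percentage point. A minimal sketch, using hypothetical helper names:

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Percent change under the ASSUMED formula BASE / HEAD - 1.

    Negative values mean HEAD is slower than BASE.
    """
    return (base_ms / head_ms - 1.0) * 100.0


def slowdown_factor(change_pct: float) -> float:
    """Invert the assumed formula: how many times slower HEAD is than BASE."""
    return 1.0 / (1.0 + change_pct / 100.0)


# (BASE ms, HEAD ms, reported Change %) copied from two rows of the table.
rows = {
    "test_elemwise[side=100-rank=1-format='coo'-add]": (2.9, 3.7, -21.3),
    "test_elemwise[side=500-rank=2-format='coo'-add]": (7.1, 8.0, -10.33),
}

for name, (base, head, reported) in rows.items():
    computed = relative_change(base, head)
    print(f"{name}: computed {computed:.2f}%, reported {reported}%")

# Under the same assumption, the headline -97.81% would correspond to the
# worst benchmark running about this many times slower on HEAD.
print(f"{slowdown_factor(-97.81):.1f}x")
```

If the assumption holds, the rows shown here are slowdowns of roughly 10 to 30 percent, while the 97.81% headline figure must come from a benchmark outside the first 20 displayed.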

hameerabbasi force-pushed the aws-gpu branch 4 times, most recently from 4e69a90 to ae8e04f (May 7, 2025 11:51)