Adapt Splash Attention from TorchPrime #8911
Conversation
Nice!
Looks like some comments still need to be addressed -- LMK whenever I should TAL!
Yes, I am working on getting rid of the …
Oh, interesting that the test failed on the cache miss count. Looks like the HLO cache can be reused between test functions.
Hi @tengyifei, I created issue #8963 to track the hashing issue and will follow up with the fix in a separate PR.
SGTM / LGTM
This PR adapts AI-Hypercomputer/torchprime#145 from TorchPrime into PTXLA, and simplifies the code to use the JIT hashing from #8878.
In addition, it fixes a small bug in xla_builder.call_jax that occurs when the input args contain both None and other hashable types in a sequence.
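For illustration, here is a minimal sketch of the kind of call that exercises the `xla_builder.call_jax` fix. This is not code from the PR: the `attention_like` function, its arguments, and the exact `call_jax` signature are assumptions for the example.

```python
# A minimal sketch, assuming the xb.call_jax(jax_func, args) signature.
# Function and tensor names are illustrative, not taken from this PR.
import torch
import torch_xla.core.xla_model as xm
import torch_xla.core.xla_builder as xb
import jax.numpy as jnp


def attention_like(q, k, mask):
  # `mask` is an optional positional arg; because None is a Python value,
  # the branch below is resolved at trace time.
  out = jnp.einsum('ij,kj->ik', q, k)
  if mask is not None:
    out = out + mask
  return out


device = xm.xla_device()
q = torch.randn(4, 8, device=device)
k = torch.randn(4, 8, device=device)

# The args sequence mixes None with other (hashable) leaves; hashing such a
# mixed sequence for the JIT cache is where the reported bug appeared.
out = xb.call_jax(attention_like, (q, k, None))
```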