
FlashAttention1.0.9 error in 2080Ti #40

Open
miraclewkf opened this issue Mar 15, 2024 · 1 comment

@miraclewkf

I use FlashAttention 1.0.9, which supports Turing GPUs (2080 Ti), but I get this error:
RuntimeError: FlashAttention backward for head dim > 64 requires A100 or H100 GPUs as the implementation needs a large amount of shared memory.

@huyiming2018

The main constraint is the size of shared memory.
As the error message says, the backward pass for head dim > 64 requires an A100 or H100. The forward pass for head dim <= 128 and the backward pass for head dim <= 64 work on other GPUs.
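
For anyone hitting this on a 2080 Ti, a minimal sketch of one possible workaround: check the head dimension and compute capability up front, and fall back to PyTorch's standard scaled_dot_product_attention when FlashAttention's backward constraint cannot be met. This assumes PyTorch >= 2.0, and the `flash_attn_func` import is illustrative (the exact function name and import path differ between flash-attn releases):

```python
import torch
import torch.nn.functional as F


def flash_backward_supported(head_dim: int) -> bool:
    # Constraint described above: backward for head_dim > 64 needs the large
    # shared memory of an A100 (sm80) or H100 (sm90); head_dim <= 64 backward
    # also works on older GPUs such as the 2080 Ti (sm75).
    if head_dim <= 64:
        return True
    return torch.cuda.get_device_capability() in ((8, 0), (9, 0))


def attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim), fp16/bf16 tensors on GPU
    head_dim = q.shape[-1]
    if flash_backward_supported(head_dim):
        # Illustrative import; the exact name/path depends on the flash-attn version.
        from flash_attn import flash_attn_func
        # flash-attn expects (batch, seq_len, heads, head_dim)
        out = flash_attn_func(q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2))
        return out.transpose(1, 2)
    # Fallback: PyTorch's built-in scaled dot-product attention (PyTorch >= 2.0)
    return F.scaled_dot_product_attention(q, k, v)
```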
