From 2f1ca04e9e44aca2a1a65a16e9ffba4da3ef71a3 Mon Sep 17 00:00:00 2001
From: erfanzar
Date: Thu, 24 Oct 2024 02:19:44 +0330
Subject: [PATCH] updating `README.md`

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index ac6a752..f2ee489 100644
--- a/README.md
+++ b/README.md
@@ -97,6 +97,7 @@ attention = get_cached_flash_attention(
 ### Environment Variables
 
 - `FORCE_MHA`: Set to "true", "1", or "on" to force using MHA implementation even for GQA cases
+- `FLASH_ATTN_BLOCK_PTR`: set to "1" to use `tl.make_block_ptr` for accessing pointer in fwd mode (better for H100/H200 GPUs)
 
 ## Performance Tips
 