
Commit d621cf6 (parent 6c728f7)

Enable bitsandbytes quantization on warp size 32 AMD GPUs

File tree: 1 file changed (+3, −0)


vllm/platforms/rocm.py

Lines changed: 3 additions & 0 deletions
@@ -199,6 +199,9 @@ class RocmPlatform(Platform):
         "petit_nvfp4",
         "torchao",
     ]
+    # bitsandbytes is not supported on GPUs with warp size 64 (gfx9)
+    if not on_gfx9():
+        supported_quantization += ["bitsandbytes"]
 
     @classmethod
     def get_vit_attn_backend(cls, head_size: int, dtype: torch.dtype) -> "_Backend":
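The added check means bitsandbytes only becomes a selectable quantization method on ROCm GPUs with a 32-wide wavefront (warp), i.e. non-gfx9 RDNA parts, while gfx9 CDNA accelerators with a 64-wide wavefront continue to reject it. Below is a minimal sketch of how such an architecture gate could be written on a ROCm build of PyTorch; the helper name `on_gfx9` matches the diff, but the body shown here is an illustrative assumption, not the actual vLLM implementation.

```python
import torch


def on_gfx9() -> bool:
    """Illustrative gfx9 check: True for CDNA parts (gfx90a, gfx942, ...),
    which use a 64-wide wavefront; False for warp-size-32 RDNA parts."""
    # gcnArchName is exposed by ROCm builds of PyTorch,
    # e.g. "gfx90a:sramecc+:xnack-" or "gfx1100".
    arch = torch.cuda.get_device_properties(0).gcnArchName
    return arch.split(":")[0].startswith("gfx9")
```

With the gate in place, a warp-size-32 AMD GPU should pass the platform's supported-quantization check, so a pre-quantized bitsandbytes checkpoint can be requested in the usual way. The model name below is only an example, and the exact flags depend on the vLLM release (older versions also required `load_format="bitsandbytes"`):

```python
from vllm import LLM

# Hypothetical usage: any 4-bit bitsandbytes checkpoint would do.
llm = LLM(
    model="unsloth/llama-3-8b-bnb-4bit",
    quantization="bitsandbytes",
)
print(llm.generate("Hello from an RDNA GPU")[0].outputs[0].text)
```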
