From e20233d361b4e6a7cb8e37c6d7f85e9900527802 Mon Sep 17 00:00:00 2001
From: Woosuk Kwon
Date: Tue, 13 Aug 2024 01:37:08 -0700
Subject: [PATCH] Revert "[Doc] Update supported_hardware.rst (#7276)" (#7467)

---
 .../quantization/supported_hardware.rst | 28 +++++++++----------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/docs/source/quantization/supported_hardware.rst b/docs/source/quantization/supported_hardware.rst
index bb41bfed342c6..ecc330d866dbd 100644
--- a/docs/source/quantization/supported_hardware.rst
+++ b/docs/source/quantization/supported_hardware.rst
@@ -5,20 +5,18 @@ Supported Hardware for Quantization Kernels
 
 The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
 
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-Implementation        Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-AWQ                   ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-GPTQ                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-Marlin (GPTQ/AWQ/FP8) ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-INT8 (W8A8)           ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-FP8 (W8A8)            ❌      ❌       ❌       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-AQLM                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-bitsandbytes          ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-DeepSpeedFP           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-GGUF                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-SqueezeLLM            ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+Implementation Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+AQLM           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+AWQ            ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+DeepSpeedFP    ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+FP8            ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+Marlin         ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+GPTQ           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+SqueezeLLM     ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+bitsandbytes   ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
 
 Notes:
 ^^^^^^
@@ -29,4 +27,4 @@ Notes:
 
 Please note that this compatibility chart may be subject to change as vLLM continues to evolve and expand its support for different hardware platforms and quantization methods.
 
-For the most up-to-date information on hardware support and quantization methods, please check the `quantization directory `_ or consult with the vLLM development team.
+For the most up-to-date information on hardware support and quantization methods, please check the `quantization directory `_ or consult with the vLLM development team.
\ No newline at end of file