[Tensor Parallel] TP-AWARE DEQUANTIZATION(@IBM T.J. Watson Research Center)
DefTruth · Feb 17, 2024 · 0e8cfd4
1 parent f5d7460
commit 0e8cfd4
Showing 1 changed file with 4 additions and 4 deletions.
diff --git a/README.md b/README.md
@@ -59,8 +59,8 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 * [Mixture-of-Experts(MoE) LLM Inference](#Mixture_of_Experts_LLM_Inference)
 * [CPU/Single GPU/Mobile Inference](#CPU-Single-GPU-Inference)
 * [Non Transformer Architecture](#Non-Transformer-Architecture)
-* [GEMM、Tensor Cores、WMMA、Parallel](#GEMM-Tensor-Cores-WMMA)  
-* [Position Embed、Others](#Others)
+* [GEMM/Tensor Cores/WMMA/Parallel](#GEMM-Tensor-Cores-WMMA)  
+* [Position Embed/Others](#Others)
 
 
 ### 📖LLM Algorithmic/Eval Survey ([©️back👆🏻](#paperlist))  
@@ -249,7 +249,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 |2023.05|🔥🔥[**RWKV**] RWKV: Reinventing RNNs for the Transformer Era(@Bo Peng etc) |[[pdf]](https://arxiv.org/pdf/2305.13048.pdf)|[[RWKV-LM]](https://github.com/BlinkDL/RWKV-LM) ![](https://img.shields.io/github/stars/BlinkDL/RWKV-LM.svg?style=social)|⭐️⭐️ |          
 |2023.12|🔥🔥[**Mamba**] Mamba: Linear-Time Sequence Modeling with Selective State Spaces(@cs.cmu.edu etc) |[[pdf]](https://arxiv.org/pdf/2312.00752.pdf)|[[mamba]](https://github.com/state-spaces/mamba) ![](https://img.shields.io/github/stars/state-spaces/mamba.svg?style=social)|⭐️⭐️ |          
 
-### 📖GEMM、Tensor Cores、WMMA、Parallel ([©️back👆🏻](#paperlist))    
+### 📖GEMM/Tensor Cores/WMMA/Parallel ([©️back👆🏻](#paperlist))    
 <div id="GEMM-Tensor-Cores-WMMA"></div>  
 
 |Date|Title|Paper|Code|Recom|
@@ -260,7 +260,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
 |2024.02|[QUICK] QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference(@SqueezeBits Inc)|[[pdf]](https://arxiv.org/pdf/2402.10076.pdf)|[[QUICK]](https://github.com/SqueezeBits/QUICK) ![](https://img.shields.io/github/stars/SqueezeBits/QUICK.svg?style=social)|⭐️⭐️ |
 |2024.02|[Tensor Parallel] TP-AWARE DEQUANTIZATION(@IBM T.J. Watson Research Center)|[[pdf]](https://arxiv.org/pdf/2402.04925.pdf)|⚠️|⭐️ | 
 
-### 📖Position Embed、Others ([©️back👆🏻](#paperlist))  
+### 📖Position Embed/Others ([©️back👆🏻](#paperlist))  
 <div id="Others"></div>  
 
 |Date|Title|Paper|Code|Recom|