Releases: DefTruth/Awesome-LLM-Inference

v1.6

23 Jul 01:09
a186334

Full Changelog: v1.5...v1.6

v1.5

15 Jul 01:23
7e9c309

Full Changelog: v1.3...v1.5

v1.3

08 Jul 01:38
353867a

What's Changed

  • [MoA] MoA: Mixture of Sparse Attention for Automatic LLM Compression by @liyucheng09 in #19

Full Changelog: v1.2...v1.3

v1.2

20 Jun 01:12
33833fb

Full Changelog: v1.1...v1.2

v1.1

12 Jun 01:11
edfe64e

Full Changelog: v1.0...v1.1

v1.0

01 Jun 09:24
2962605

Full Changelog: v0.9...v1.0

v0.9

27 May 02:19
ab989ba

What's Changed

  • Update [Decoding Speculative Decoding] GitHub repo by @KylinC in #16

Full Changelog: v0.8...v0.9

v0.8

09 May 02:10
d1da65d

Full Changelog: v0.7...v0.8

v0.7

27 Apr 06:37
754ec24

New Contributors

  • @Miroier made their first contribution in #14
  • @HarryWu-CHN made their first contribution in #13
  • @preminstrel made their first contribution in #15

Full Changelog: v0.6...v0.7

Awesome-LLM-Inference v0.6

14 Apr 06:28
25cbc41

What's Changed

  • Add an ICLR paper for KV cache compression by @Janghyun1230 in #8
  • Add GitHub link for paper FP8-Quantization [2208.09225] by @Mr-Philo in #9

Full Changelog: v0.5...v0.6