Releases: DefTruth/Awesome-LLM-Inference

v1.6

23 Jul 01:09
a186334

Full Changelog: v1.5...v1.6

v1.5

15 Jul 01:23
7e9c309

Full Changelog: v1.3...v1.5

v1.3

08 Jul 01:38
353867a

What's Changed

  • [MoA] MoA: Mixture of Sparse Attention for Automatic LLM Compression by @liyucheng09 in #19

Full Changelog: v1.2...v1.3

v1.2

20 Jun 01:12
33833fb

Full Changelog: v1.1...v1.2

v1.1

12 Jun 01:11
edfe64e

Full Changelog: v1.0...v1.1

v1.0

01 Jun 09:24
2962605

Full Changelog: v0.9...v1.0

v0.9

27 May 02:19
ab989ba

What's Changed

  • Update [Decoding Speculative Decoding] GitHub repo by @KylinC in #16

Full Changelog: v0.8...v0.9

v0.8

09 May 02:10
d1da65d

Full Changelog: v0.7...v0.8

v0.7

27 Apr 06:37
754ec24

New Contributors

  • @Miroier made their first contribution in #14
  • @HarryWu-CHN made their first contribution in #13
  • @preminstrel made their first contribution in #15

Full Changelog: v0.6...v0.7

Awesome-LLM-Inference v0.6

14 Apr 06:28
25cbc41

What's Changed

  • Add an ICLR paper for KV cache compression by @Janghyun1230 in #8
  • Add GitHub link for paper FP8-Quantization [2208.09225] by @Mr-Philo in #9

Full Changelog: v0.5...v0.6