v0.2.0 Release Note
Opening Thoughts 🫶
Thank You!
We'd love to take this chance to express our sincere gratitude to the community! 2500+ ⭐, 10+ new contributors, 50+ PRs, plus integration into Hugging Face 🤗, axolotl, and LLaMA-Factory in less than one week since going open source is totally beyond our expectations. Being able to work together with all the cool people in the community is pure bliss, and we can't wait for further collaborations down the road!
Looking Ahead
We look forward to deepening our collaboration with the community and working together on a lot of cool stuff: support for more model families, squeezing every optimization opportunity out of our kernels, and, why not, llama.triton? 😄
Get Involved and Stay Tuned
Please feel free to join our Discord channel hosted on the CUDA MODE server, and follow our repo's official account on X: https://x.com/liger_kernel !
Welcome Phi3 and Qwen2 🎉
This release ships with support for more popular models, including Phi3 and Qwen2. All existing kernels in the Liger repo can now be leveraged to boost training with models from these families. Please refer to our API guide for usage details.
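For example, here is a minimal sketch of patching Qwen2 before loading a checkpoint (the checkpoint name below is only an illustration; `apply_liger_kernel_to_phi3` works the same way for Phi3):

```python
import transformers
from liger_kernel.transformers import apply_liger_kernel_to_qwen2

# Monkey-patch the HF Qwen2 modeling code with Liger's Triton kernels.
# Call this before instantiating the model.
apply_liger_kernel_to_qwen2()

model = transformers.AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct"  # example checkpoint
)
```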
Even Easier API ❤️
Experimenting with different model families and tired of sprinkling if-else everywhere just to switch between kernel patching functions? You can now try our new model-agnostic API to apply Liger kernels. Still a one-liner, but more elegant :) For example:
```python
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# This AutoModel wrapper class automatically monkey-patches the
# model with the optimized Liger kernels if the model is supported.
model = AutoLigerKernelForCausalLM.from_pretrained(...)
```
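For contrast, a rough sketch of the per-model branching this wrapper replaces (using the model-specific patching functions from the repo; `model_name` is a hypothetical variable for illustration):

```python
from liger_kernel.transformers import (
    apply_liger_kernel_to_llama,
    apply_liger_kernel_to_qwen2,
)

model_name = "meta-llama/Meta-Llama-3-8B"  # hypothetical example

# The old way: one branch per supported model family.
if "llama" in model_name.lower():
    apply_liger_kernel_to_llama()
elif "qwen2" in model_name.lower():
    apply_liger_kernel_to_qwen2()
```

AutoLigerKernelForCausalLM resolves the model type from the checkpoint's config and applies the matching patch for you.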
More Features
- Support an optional bias term in FusedLinearCrossEntropy (#144); see the sketch after this list
- Mistral is now equipped with the humongous memory reduction from FusedLinearCrossEntropy (#93)
- Gemma is now equipped with the humongous memory reduction from FusedLinearCrossEntropy (#111)
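A rough sketch of the standalone loss with the new bias term; the argument order shown is our reading of the API, so please verify against the API guide:

```python
import torch
from liger_kernel.transformers import LigerFusedLinearCrossEntropyLoss

B, T, H, V = 4, 128, 4096, 32000  # batch, seq len, hidden dim, vocab size
hidden = torch.randn(B * T, H, device="cuda", requires_grad=True)
labels = torch.randint(0, V, (B * T,), device="cuda")
lm_head = torch.nn.Linear(H, V, bias=True, device="cuda")

loss_fn = LigerFusedLinearCrossEntropyLoss()
# The fused kernel consumes the lm_head weight (and, as of #144, an
# optional bias) directly, so the full (B*T, V) logits tensor is never
# materialized. Argument order here is an assumption; check the API guide.
loss = loss_fn(lm_head.weight, hidden, labels, lm_head.bias)
loss.backward()
```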
Bug Fixes
- Fixed import error when using `triton>=3.0.0` on NGC containers (#79)
- Fixed the missing offset in Gemma RMSNorm (#85) oops
- Added back missing dataclass entries in efficiency callback (#116)
- There was some confusion about which Gemma we support; we now support all of them! (#125)
- Fall back to torch-native linear + CrossEntropy when no labels are provided (#128); see the sketch after this list
- Match the exact dtype upcasting and downcasting in Llama & Gemma RMSNorm (#92)
- Fixed a bug where RoPE got very slow with dynamic sequence lengths (#149)
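To illustrate the label-free fallback (#128), here is a sketch of the behavior we expect (the checkpoint name is only an example):

```python
import torch
from transformers import AutoTokenizer
from liger_kernel.transformers import AutoLigerKernelForCausalLM

model_id = "Qwen/Qwen2-1.5B-Instruct"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoLigerKernelForCausalLM.from_pretrained(model_id).cuda()

inputs = tokenizer("Liger kernels go brrr", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model(**inputs)  # note: no `labels` kwarg

# With no labels, the patched forward falls back to the torch-native
# linear + CrossEntropy path (#128), so plain inference returns logits
# and a None loss instead of erroring inside the fused loss path.
assert out.loss is None and out.logits is not None
```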
What's Changed
- Updated test tolerances for H100 by @shimizust in #55
- Update README.md by @lancerts in #58
- Update benchmark result of Medusa for batch size = 6 setup by @JasonZhu1313 in #59
- Add star graph by @shivam15s in #60
- Add monkey patch for Qwen2 models by @chiwanpark in #69
- Add pytest and datasets to dev dependencies by @chiwanpark in #68
- Fix typos by @pchng in #77
- Remove unused images in `examples/medusa/docs/images/` by @pchng in #78
- chore: update cross_entropy.py by @eltociear in #84
- Fix incorrect import for triton 3 by @arvindsun in #79
- update install from source guide by @yundai424 in #86
- Fix Gemma RMSNorm by @davidgonmar in #85
- Fix example bugs by @qingquansong in #88
- Make tests passing on AMD GPU with 24GB ram by @helloworld1 in #90
- modified: README.md by @leaf-soba in #91
- pytest without need to dealing with PYTHONPATH by @helloworld1 in #95
- Update test_cross_entropy.py by @lancerts in #94
- Add FusedLinerCrossEntropy support for Mistral by @Tcc0403 in #93
- Remove duplicate images by @qingquansong in #107
- Add Qwen benchmarks by @shivam15s in #108
- Fix Mixtral typo by @Tcc0403 in #109
- Explicitly add dependencies in req.txt for medusa example by @JasonZhu1313 in #110
- Add convergence tests and trainer integration test for Qwen2 by @Tcc0403 in #105
- [Bug fix] Efficiency callback missing dataclass entries by @tyler-romero in #116
- Monkeypatch for Phi3 by @tyler-romero in #76
- Add FusedLinearCrossEntropy to Gemma by @Luke-Chesley in #111
- Makefile command for env-report by @tyler-romero in #114
- [WIP] Fix confusion on Gemma by @yundai424 in #121
- [tiny] reformat code by @tyler-romero in #122
- Revert "[WIP] Fix confusion on Gemma (#121)" by @yundai424 in #123
- Fix gemma 1 and 2 support by @yundai424 in #125
- Adding AutoLigerKernelForCausalLM by @shimizust in #115
- fallback to torch native linear+CE when without label by @yundai424 in #128
- Add code to save medusa heads and model by @JasonZhu1313 in #130
- Add FusedLinerCrossEntropy support for Phi3 by @tyler-romero in #103
- Add GPU CI support by @helloworld1 in #134
- Make GPU CI optional until it is more stable by @helloworld1 in #141
- Add gemma lightning example for single L40 GPU by @qingquansong in #120
- feat: correct casts in RMSNorm to match references by @davidgonmar in #92
- Bias for fused linear cross entropy by @davidgonmar in #144
- Rerun FLCE benchmark after bias added by @ByronHsu in #148
- updated sl to be non-constexpr by @AndreSlavescu in #149
- update readme to use absolute paths by @shaoruu in #157
- fix convergence test, phi3 import and update benchmark by @yundai424 in #155
- bump lowest HF version by @yundai424 in #158
- Add missing tf_keras to req.txt by @JasonZhu1313 in #161
- Re-enable GPU CI enforce by @helloworld1 in #142
- Bump package ver by @yundai424 in #163
- Update version in setup.py to 0.2.0 by @yundai424 in #164
New Contributors
- @chiwanpark made their first contribution in #69
- @pchng made their first contribution in #77
- @eltociear made their first contribution in #84
- @arvindsun made their first contribution in #79
- @davidgonmar made their first contribution in #85
- @leaf-soba made their first contribution in #91
- @Tcc0403 made their first contribution in #93
- @tyler-romero made their first contribution in #116
- @Luke-Chesley made their first contribution in #111
- @AndreSlavescu made their first contribution in #149
- @shaoruu made their first contribution in #157
Full Changelog: v0.1.1...v0.2.0