v0.2.0 Release Note
Opening Thoughts 🫶
Thank You!
We'd love to take this chance to express our sincere gratitude to the community! 2500+ ⭐, 10+ new contributors, 50+ PRs, plus integration into Hugging Face 🤗, axolotl, and LLaMA-Factory in less than one week since going open source is totally beyond our expectations. Being able to work together with all the cool people in the community is pure bliss, and we can't wait for further collaborations down the road!
Looking Ahead
We look forward to deepening our collaboration with the community and working together on a lot of cool stuff: support for more model families, squeezing every optimization opportunity out of our kernels, and, why not, llama.triton? 😄
Get Involved and Stay Tuned
Please feel free to join our Discord channel hosted on the CUDA MODE server, and follow our repo's official account on X: https://x.com/liger_kernel !
Welcome Phi3 and Qwen2 🎉
This release ships with support for more popular models, including Phi3 and Qwen2. All existing kernels in the Liger repo can now be leveraged to boost training with models from these families. Please refer to our API guide for usage details.
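For example, here is a minimal sketch of patching Qwen2 before loading a checkpoint (the checkpoint name below is only an illustration; `apply_liger_kernel_to_phi3` works the same way for Phi3):

```python
import transformers
from liger_kernel.transformers import apply_liger_kernel_to_qwen2

# Monkey-patch the HF Qwen2 modeling code with Liger's Triton kernels.
# Call this before instantiating the model.
apply_liger_kernel_to_qwen2()

model = transformers.AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct"  # example checkpoint
)
```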
Even Easier API ❤️
Experimenting with different model families and tired of sprinkling if-else everywhere just to switch between kernel patching functions? You can now try our new model-agnostic API to apply Liger kernels. Still a one-liner, but more elegant :) For example:
```python
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# This AutoModel wrapper class automatically monkey-patches the
# model with the optimized Liger kernels if the model is supported.
model = AutoLigerKernelForCausalLM.from_pretrained(...)
```
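For contrast, a rough sketch of the per-model branching this wrapper replaces (using the model-specific patching functions from the repo; `model_name` is a hypothetical variable for illustration):

```python
from liger_kernel.transformers import (
    apply_liger_kernel_to_llama,
    apply_liger_kernel_to_qwen2,
)

model_name = "meta-llama/Meta-Llama-3-8B"  # hypothetical example

# The old way: one branch per supported model family.
if "llama" in model_name.lower():
    apply_liger_kernel_to_llama()
elif "qwen2" in model_name.lower():
    apply_liger_kernel_to_qwen2()
```

AutoLigerKernelForCausalLM resolves the model type from the checkpoint's config and applies the matching patch for you.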
More Features
- Support an optional bias term in FusedLinearCrossEntropy (#144); see the sketch after this list
- Mistral is now equipped with the humongous memory reduction from FusedLinearCrossEntropy (#93)
- Gemma is now equipped with the humongous memory reduction from FusedLinearCrossEntropy (#111)
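A rough sketch of the standalone loss with the new bias term; the argument order shown is our reading of the API, so please verify against the API guide:

```python
import torch
from liger_kernel.transformers import LigerFusedLinearCrossEntropyLoss

B, T, H, V = 4, 128, 4096, 32000  # batch, seq len, hidden dim, vocab size
hidden = torch.randn(B * T, H, device="cuda", requires_grad=True)
labels = torch.randint(0, V, (B * T,), device="cuda")
lm_head = torch.nn.Linear(H, V, bias=True, device="cuda")

loss_fn = LigerFusedLinearCrossEntropyLoss()
# The fused kernel consumes the lm_head weight (and, as of #144, an
# optional bias) directly, so the full (B*T, V) logits tensor is never
# materialized. Argument order here is an assumption; check the API guide.
loss = loss_fn(lm_head.weight, hidden, labels, lm_head.bias)
loss.backward()
```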
Bug Fixes
- Fixed import error when using `triton>=3.0.0` on NGC containers (#79)
- Fixed the missing offset in Gemma RMSNorm (#85) oops
- Added back missing dataclass entries in efficiency callback (#116)
- There was some confusion about which Gemma we support; we now support all of them! (#125)
- Fall back to torch-native linear + CrossEntropy when no labels are provided (#128); see the sketch after this list
- Match the exact dtype upcasting and downcasting in Llama & Gemma RMSNorm (#92)
- Fixed a bug where RoPE got very slow with dynamic sequence lengths (#149)
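To illustrate the label-free fallback (#128), here is a sketch of the behavior we expect (the checkpoint name is only an example):

```python
import torch
from transformers import AutoTokenizer
from liger_kernel.transformers import AutoLigerKernelForCausalLM

model_id = "Qwen/Qwen2-1.5B-Instruct"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoLigerKernelForCausalLM.from_pretrained(model_id).cuda()

inputs = tokenizer("Liger kernels go brrr", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model(**inputs)  # note: no `labels` kwarg

# With no labels, the patched forward falls back to the torch-native
# linear + CrossEntropy path (#128), so plain inference returns logits
# and a None loss instead of erroring inside the fused loss path.
assert out.loss is None and out.logits is not None
```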
What's Changed
- Updated test tolerances for H100 by @shimizust in #55
- Update README.md by @lancerts in #58
- Update benchmark result of Medusa for batch size = 6 setup by @JasonZhu1313 in #59
- Add star graph by @shivam15s in #60
- Add monkey patch for Qwen2 models by @chiwanpark in #69
- Add pytest and datasets to dev dependencies by @chiwanpark in #68
- Fix typos by @pchng in #77
- Remove unused images in `examples/medusa/docs/images/` by @pchng in #78
- chore: update cross_entropy.py by @eltociear in #84
- Fix incorrect import for triton 3 by @arvindsun in #79
- update install from source guide by @yundai424 in #86
- Fix Gemma RMSNorm by @davidgonmar in #85
- Fix example bugs by @qingquansong in #88
- Make tests passing on AMD GPU with 24GB ram by @helloworld1 in #90
- modified: README.md by @leaf-soba in #91
- pytest without need to dealing with PYTHONPATH by @helloworld1 in #95
- Update test_cross_entropy.py by @lancerts in #94
- Add FusedLinerCrossEntropy support for Mistral by @Tcc0403 in #93
- Remove duplicate images by @qingquansong in #107
- Add Qwen benchmarks by @shivam15s in #108
- Fix Mixtral typo by @Tcc0403 in #109
- Explicitly add dependencies in req.txt for medusa example by @JasonZhu1313 in #110
- Add convergence tests and trainer integration test for Qwen2 by @Tcc0403 in #105
- [Bug fix] Efficiency callback missing dataclass entries by @tyler-romero in #116
- Monkeypatch for Phi3 by @tyler-romero in #76
- Add FusedLinearCrossEntropy to Gemma by @Luke-Chesley in #111
- Makefile command for env-report by @tyler-romero in #114
- [WIP] Fix confusion on Gemma by @yundai424 in #121
- [tiny] reformat code by @tyler-romero in #122
- Revert "[WIP] Fix confusion on Gemma (#121)" by @yundai424 in #123
- Fix gemma 1 and 2 support by @yundai424 in #125
- Adding AutoLigerKernelForCausalLM by @shimizust in #115
- fallback to torch native linear+CE when without label by @yundai424 in #128
- Add code to save medusa heads and model by @JasonZhu1313 in #130
- Add FusedLinerCrossEntropy support for Phi3 by @tyler-romero in #103
- Add GPU CI support by @helloworld1 in #134
- Make GPU CI optional until it is more stable by @helloworld1 in #141
- Add gemma lightning example for single L40 GPU by @qingquansong in #120
- feat: correct casts in RMSNorm to match references by @davidgonmar in #92
- Bias for fused linear cross entropy by @davidgonmar in #144
- Rerun FLCE benchmark after bias added by @ByronHsu in #148
- updated sl to be non-constexpr by @AndreSlavescu in #149
- update readme to use absolute paths by @shaoruu in #157
- fix convergence test, phi3 import and update benchmark by @yundai424 in #155
- bump lowest HF version by @yundai424 in #158
- Add missing tf_keras to req.txt by @JasonZhu1313 in #161
- Re-enable GPU CI enforce by @helloworld1 in #142
- Bump package ver by @yundai424 in #163
- Update version in setup.py to 0.2.0 by @yundai424 in #164
New Contributors
- @chiwanpark made their first contribution in #69
- @pchng made their first contribution in #77
- @eltociear made their first contribution in #84
- @arvindsun made their first contribution in #79
- @davidgonmar made their first contribution in #85
- @leaf-soba made their first contribution in #91
- @Tcc0403 made their first contribution in #93
- @tyler-romero made their first contribution in #116
- @Luke-Chesley made their first contribution in #111
- @AndreSlavescu made their first contribution in #149
- @shaoruu made their first contribution in #157
Full Changelog: v0.1.1...v0.2.0