
Releases: linkedin/Liger-Kernel

v0.5.5: Chunk size fixes for JSD; KTO speed fixes; better metrics tests

14 Mar 00:27
a6dc70d

What's Changed

New Contributors

Full Changelog: v0.5.4...v0.5.5

v0.5.4: Granite 3.0 & 3.1, OLMo2, GRPO, TVD loss, and minor fixes

24 Feb 21:59
911db5d

What's Changed

New Contributors

Full Changelog: v0.5.3...v0.5.4

v0.5.3: Minor fixes for post-training losses and support for KTO Loss

10 Feb 23:29
80b409a

What's Changed

New Contributors

Full Changelog: v0.5.2...v0.5.3

v0.5.2: Fix Qwen2VL mrope for transformers>=4.47

11 Dec 05:58
966eb73

What's Changed

  • Disable Qwen2 VL test for with logits conv test by @ByronHsu in #463
  • Fix Qwen2VL mrope for transformers 4.47.0 by @li-plus in #464
  • Revert Workaround of Disabling QWEN2_VL in Convergence Tests by @austin362667 in #466

Full Changelog: v0.5.1...v0.5.2

v0.5.1: Patch Fix Import Error

10 Dec 09:30
62a3c7d

What's Changed

Full Changelog: v0.5.0...v0.5.1

v0.5.0: First open source optimized Post Training Loss, AMD CI, XPU Support

10 Dec 03:30
37ffbe9

Highlights

  1. Post Training Loss: Introducing the first open-source optimized post-training losses in Liger Kernel with ~80% memory reduction, featuring DPO, CPO, ORPO, SimPO, JSD, and more. No more OOM nightmares for post-training ML researchers! (A minimal sketch of the chunked pattern follows this list.)
  2. AMD CI: With AMD’s generous sponsorship of MI300s, we’ve integrated them into our CI. Special thanks to Embedded LLM for building the AMD CI infrastructure. #428
  3. XPU Support: In collaboration with Intel, we now support XPU, demonstrating performance gains comparable to those on other vendors' hardware. #407
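
To make the memory claim concrete, here is a minimal plain-PyTorch sketch of the chunked fused-linear pattern these losses share, shown with ordinary cross entropy rather than a preference loss. It is an illustration, not Liger's API; the real kernels also compute gradients chunk-by-chunk so that backward-saved tensors stay chunk-sized, which this autograd-based sketch does not do.

```python
import torch
import torch.nn.functional as F

def chunked_fused_linear_ce(hidden, lm_head_weight, labels, chunk_size=1024):
    """hidden: [N, H], lm_head_weight: [V, H], labels: [N] -> mean loss over non-ignored tokens."""
    total = hidden.new_zeros(())
    n_valid = 0
    for start in range(0, hidden.shape[0], chunk_size):
        h = hidden[start:start + chunk_size]   # [C, H]
        y = labels[start:start + chunk_size]   # [C]
        # Only one [C, V] slab of logits is built at a time; the full
        # [N, V] logits tensor is never materialized.
        logits = h @ lm_head_weight.T
        total = total + F.cross_entropy(logits, y, ignore_index=-100, reduction="sum")
        n_valid += int((y != -100).sum())
    return total / max(n_valid, 1)
```

The preference losses (DPO, ORPO, CPO, SimPO) apply the same chunking to the chosen/rejected log-probabilities instead of a plain token-level cross entropy.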

What's Changed

New Contributors

v0.4.2: Fix 'RMSNorm' object has no attribute 'in_place'

17 Nov 19:22
cbebed6

Highlights

Fix #390 #383

What's Changed

Full Changelog: v0.4.1...v0.4.2

v0.4.1: Gemma 2 Support, CrossEntropy Patching Fix, and GroupNorm

12 Nov 23:42
d784664

Highlights

  1. Gemma 2 Support: The long-pending Gemma 2 is finally supported thanks to @Tcc0403! He implemented the nasty softcapping in fused linear cross entropy (#320) and discovered the convergence issue, which was later fixed by @ByronHsu and @Tcc0403 together (#376). A reference sketch of the softcapping follows this list.

  2. CrossEntropy Patching Fix: If you monkey patch CrossEntropy (not FLCE), it was actually not being patched after transformers 4.46.1, because CrossEntropy was replaced with F.cross_entropy in the model code. We fixed the issue in PR #375.

  3. GroupNorm Kernel: Our new contributor @pramodith implemented a GroupNorm kernel (#375) with a 2x speedup.
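
For reference, the softcapping fused into the linear cross entropy in #320 is a smooth tanh bound on the logits. Below is an unfused PyTorch sketch, not the Triton kernel; the default of 30.0 mirrors Gemma 2's published final_logit_softcapping and is only illustrative.

```python
import torch
import torch.nn.functional as F

def softcapped_cross_entropy(logits, labels, softcap=30.0):
    # Squash every logit into (-softcap, softcap) with tanh, as Gemma 2 does
    # before its loss, then compute the usual cross entropy.
    capped = softcap * torch.tanh(logits / softcap)
    return F.cross_entropy(capped, labels, ignore_index=-100)
```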

What's Changed

New Contributors

Full Changelog: v0.4.0...v0.4.1

v0.4.0: Full AMD support, Tech Report, Modal CI, Llama-3.2-Vision!

05 Nov 22:15
e985195

Highlights

  1. AMD GPU: We have partnered with Embedded LLM to adjust the Triton configuration to fully support AMD! With version 0.4.0, you can run multi-GPU training with 26% higher speed and 60% lower memory usage on AMD. See the full blog post at https://embeddedllm.com/blog/cuda-to-rocm-portability-case-study-liger-kernel. @Edenzzzz @DocShotgun @tjtanaa

  2. Technical Report: We have published a technical report on arXiv (https://arxiv.org/pdf/2410.10989) with abundant details.

  3. Modal CI: We have moved our entire GPU CI stack to Modal! Thanks to intelligent Docker layer caching and blazingly fast container startup time and scheduling, we have reduced the CI overhead by over 10x (from minutes to seconds).

  4. LLaMA 3.2-Vision Model: We have added kernel support for the LLaMA 3.2-Vision model. You can easily use liger_kernel.transformers.apply_liger_kernel_to_mllama to patch the model (usage sketch after this list). @tyler-romero @shivam15s

  5. JSD Kernel: We have added the JSD kernel for distillation, which also comes with a chunking version! @Tcc0403 @yundai424 @qingquansong

  6. HuggingFace Gradient Accumulation Fixes: We have fixed the notorious HuggingFace gradient accumulation issue (huggingface/transformers#34191) by carefully adjusting the cross entropy scalar. You can now safely use v0.4.0 with the latest HuggingFace gradient accumulation fixes (transformers>=4.46.2)!
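
A usage sketch for the Llama 3.2-Vision patch in item 4. The patch function and its module path are named in this release; the model class, checkpoint id, and the patch-before-load ordering are assumptions about typical usage rather than a verified recipe.

```python
from transformers import MllamaForConditionalGeneration
from liger_kernel.transformers import apply_liger_kernel_to_mllama

# Monkey patch Liger kernels into the mllama module classes, then load as usual.
apply_liger_kernel_to_mllama()
model = MllamaForConditionalGeneration.from_pretrained("meta-llama/Llama-3.2-11B-Vision")
```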

What's Changed

New Contributors

Full Changelog: v0.3.1...v0.4.0

v0.3.1: Patch Release

01 Oct 20:55
1520999

Summary

This patch release brings important updates and fixes to Liger-Kernel. Notable changes include:

  • KLDiv calculation fix: KLDiv now functions correctly with larger vocab sizes.
  • SwiGLU/GeGLU casting fix: Program IDs are now cast to int64 in the SwiGLU/GeGLU kernels to prevent memory errors with larger dimensions (see the sketch after this list).
  • AutoLigerKernelForCausalLM fix: The model now properly passes through all original keyword arguments.
  • Post-init model patching fix: Post-init model patching now works correctly, so the HF Trainer integration behaves as expected.
  • Relaxed transformers dependency: Improves compatibility with a broader range of transformers versions.
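
To illustrate the SwiGLU/GeGLU casting fix: when n_rows * n_cols exceeds the int32 range, a row offset computed from a 32-bit program id can wrap around. Promoting the program id to int64 before the pointer arithmetic avoids that. This is a minimal Triton sketch under that assumption, not the actual Liger kernel.

```python
import triton
import triton.language as tl

@triton.jit
def silu_mul_kernel(a_ptr, b_ptr, out_ptr, n_cols, BLOCK: tl.constexpr):
    # Promote the row index to int64 so row * n_cols cannot wrap around int32
    # when the total number of elements exceeds 2**31.
    row = tl.program_id(0).to(tl.int64)
    offs = tl.arange(0, BLOCK)
    mask = offs < n_cols
    a = tl.load(a_ptr + row * n_cols + offs, mask=mask)
    b = tl.load(b_ptr + row * n_cols + offs, mask=mask)
    # SwiGLU elementwise part: silu(a) * b
    tl.store(out_ptr + row * n_cols + offs, a * tl.sigmoid(a) * b, mask=mask)
```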

What's Changed

New Contributors

Full Changelog: v0.3.0...v0.3.1