v0.8.0

@OlivierDehaene released this 30 May 16:45
· 896 commits to main since this release

Features

  • router: support vectorized warpers in flash causal lm (co-authored by @jlamypoirier)
  • proto: decrease IPC proto size
  • benchmarker: add summary tables
  • server: support RefinedWeb models

Fix

  • server: fix issue when loading AutoModelForSeq2SeqLM models (contributed by @CL-Shang)

New Contributors

Full Changelog: v0.7.0...v0.8.0