v0.8.0
Features
- router: support vectorized warpers in flash causal lm (co-authored by @jlamypoirier)
- proto: decrease IPC proto size
- benchmarker: add summary tables
- server: support RefinedWeb models
Fix
- server: fix issue when loading AutoModelForSeq2SeqLM models (contributed by @CL-Shang)
New Contributors
- @CL-Shang made their first contribution in #370
- @jlamypoirier made their first contribution in #317
Full Changelog: v0.7.0...v0.8.0