v1.4.1
Highlights
- Mamba support by @drbh in #1480 and by @Narsil in #1552
- Experimental support for cuda graphs by @OlivierDehaene in #1428
- Outlines guided generation by @drbh in #1539
- Added
name
field to OpenAI compatible API Messages by @amihalik in #1563
What's Changed
- Fixing top_n_tokens. by @Narsil in #1497
- Sending compute type from the environment instead of hardcoded string by @Narsil in #1504
- Create the compute type at launch time (if not provided in the env). by @Narsil in #1505
- Modify default for max_new_tokens in python client by @freitng in #1336
- feat: eetq gemv optimization when batch_size <= 4 by @dtlzhuangz in #1502
- fix: improve messages api docs content and formatting by @drbh in #1506
- GPTNeoX: Use static rotary embedding by @dwyatte in #1498
- Hotfix the / health - route. by @Narsil in #1515
- fix: tokenizer config should use local model path when possible by @drbh in #1518
- Updating tokenizers. by @Narsil in #1517
- [docs] Fix link to Install CLI by @pcuenca in #1526
- feat: add ie update to message docs by @drbh in #1523
- feat: use existing add_generation_prompt variable from config in temp… by @drbh in #1533
- Update to peft 0.8.2 by @Stillerman in #1537
- feat(server): add frequency penalty by @OlivierDehaene in #1541
- chore: bump ci rust version by @drbh in #1543
- ROCm AWQ support by @IlyasMoutawwakil in #1514
- feat(router): add max_batch_size by @OlivierDehaene in #1542
- feat: add deserialize_with that handles strings or objects with content by @drbh in #1550
- Fixing glibc version in the runtime. by @Narsil in #1556
- Upgrade intermediary layer for nvidia too. by @Narsil in #1557
- Improving mamba runtime by using updates by @Narsil in #1552
- Small cleanup. by @Narsil in #1560
- Bugfix: eos and bos tokens positions are inconsistent by @amihalik in #1567
- chore: add pre-commit by @OlivierDehaene in #1569
- feat: add chat template struct to avoid tuple ordering errors by @OlivierDehaene in #1570
- v1.4.1 by @OlivierDehaene in #1568
New Contributors
- @freitng made their first contribution in #1336
- @dtlzhuangz made their first contribution in #1502
- @dwyatte made their first contribution in #1498
- @pcuenca made their first contribution in #1526
- @Stillerman made their first contribution in #1537
- @IlyasMoutawwakil made their first contribution in #1514
- @amihalik made their first contribution in #1563
Full Changelog: v1.4.0...v1.4.1