Release v0.1.11
Highlights
- Serve the official release demo of LLaVA v1.6 (blog)
- Support Yi-VL example
- Faster JSON decoding (blog)
- Support QWen 2
What's Changed
- Fix the error message and dependency of openai backend by @merrymercy in #71
- Add an async example by @Ying1123 in #37
- Add a note about triton version for older GPUs by @merrymercy in #72
- Support load fine-tuned LLaVA model by @isaac-vidas in #80
- Support qwen model and solve some problems by @Arcmoon-Hu in #75
- Fix after QWen support by @merrymercy in #82
- Fix the chat template for QWen by @merrymercy in #83
- Fix SRT endpoint api json syntax by @CSWellesSun in #84
- Return logprob for choices by @merrymercy in #87
- Add health endpoint to SGLang runtime server by @isaac-vidas in #90
- Llava-hd Support by @caoshiyi in #92
- Bump the version to v0.1.8 by @merrymercy in #93
- Improve Chinese character streaming when the last char is half Chinese word. by @haotian-liu in #95
- Handle grayscale images in expand2square by @isaac-vidas in #97
- support speculative execution for openai API by @parasol-aser in #48
- fix batch error for llava-hd by @caoshiyi in #98
- Dynamic model class loading by @comaniac in #101
- Flush Cache API by @hnyls2002 in #103
- Fix Mistral model loading by @comaniac in #108
- Improve the control of streaming and improve the first token latency in streaming by @merrymercy in #117
- Add qwen2 by @JustinLin610 in #114
- Format code by @merrymercy in #118
- Update quick start examples by @merrymercy in #120
- Improve docs & Add JSON decode example by @merrymercy in #121
- [Feature] Adds basic support for image content in OpenAI chat routes by @fozziethebeat in #113
- [Feature] Allow specifying all ports to use in advance by @Ja1Zhou in #116
- Add cache metrics by @comaniac in #119
- Fix model loading & format code by @merrymercy in #125
- Add city doc benchmark mode by @hnyls2002 in #129
- Yi-VL Model by @BabyChouSr in #112
- Fix is_multimodal_model judge by @hnyls2002 in #132
- Add max_prefill_num_token into server arguments by @Ying1123 in #133
- Release 0.1.11 by @Ying1123 in #134
New Contributors
- @isaac-vidas made their first contribution in #80
- @Arcmoon-Hu made their first contribution in #75
- @CSWellesSun made their first contribution in #84
- @haotian-liu made their first contribution in #95
- @parasol-aser made their first contribution in #48
- @JustinLin610 made their first contribution in #114
- @fozziethebeat made their first contribution in #113
- @Ja1Zhou made their first contribution in #116
Full Changelog: v0.1.6...v0.1.11