Release v0.1.18

@Ying1123 released this 04 Jul 06:35 · 9 commits to main since this release

Highlights

  • 2x improvement in large-batch prefill with the new FlashInfer kernels #579
  • Multi-node tensor parallelism #550
  • New model support: ChatGLM #516

What's Changed

New Contributors

Full Changelog: v0.1.17...v0.1.18