v0.3.0

Due by March 31, 2025 · 7% complete
  • Model Centric API
  • Offline Inference Performance Improvement
  • Prefix cache + Heterogeneous Routing + Fairness request routing