SGLang Integration #8

Open
merrymercy opened this issue Feb 11, 2024 · 2 comments

Comments

@merrymercy

Nice project!

I believe this project can greatly benefit from https://github.com/sgl-project/sglang. You can try to use SGLang as a backend for local models.

  • The fast JSON decoding feature can help you enforce additional constraints and will probably help with nested JSON schemas. You can find the example here.
  • The RadixAttention feature lets you reuse the KV cache for a shared prompt prefix. You can find one example of parallel decoding here.
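
To make the shared-prefix point concrete, here is a toy sketch in plain Python. It is not SGLang's implementation and does not call SGLang; `ToyPrefixCache` and `field_prompts` are hypothetical names, and the setup (one prompt per JSON field, all sharing the same instruction text) is an assumption made for illustration. A token-level trie stands in for the KV cache and counts how many tokens can be reused versus recomputed.

```python
from dataclasses import dataclass, field


@dataclass
class PrefixNode:
    """One cached token position; stands in for a slice of KV cache."""
    children: dict = field(default_factory=dict)


class ToyPrefixCache:
    """Token-level trie recording which prompt prefixes were already processed."""

    def __init__(self):
        self.root = PrefixNode()

    def process(self, tokens):
        """Insert a prompt and return (reused, computed) token counts."""
        node = self.root
        reused = 0
        computed = 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                # Cache miss: this position would need a fresh forward pass.
                child = PrefixNode()
                node.children[tok] = child
                computed += 1
            else:
                # Cache hit: this position's state is already stored.
                reused += 1
            node = child
        return reused, computed


def field_prompts(shared_prompt, field_names):
    """Hypothetical helper: one prompt per JSON field, all sharing a prefix."""
    return [shared_prompt + ["field:", name] for name in field_names]


shared = "Extract the profile as JSON : Alice is 30 and lives in Paris".split()
cache = ToyPrefixCache()
stats = [cache.process(p) for p in field_prompts(shared, ["name", "age", "city"])]

for name, (reused, computed) in zip(["name", "age", "city"], stats):
    print(f"{name}: reused={reused} computed={computed}")
```

In this sketch only the first prompt pays for the shared instruction text; every later field recomputes just its own suffix, which is the effect prefix reuse is meant to deliver for per-field decoding.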
@varunshenoy
Owner

This is next on the list! I'm a huge fan of the work you guys have done, and I think constrained sampling plus KV cache reuse will be a game changer with Super JSON Mode.
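
As a rough illustration of what constrained sampling buys for JSON output, the sketch below builds a regular expression from a flat schema and checks candidate outputs against it. This is an assumption-laden toy: the thread does not say how constraints are expressed, `TYPE_PATTERNS`, `schema_to_regex`, and `is_allowed` are hypothetical names, and a real decoder would apply the constraint token by token rather than validating finished strings.

```python
import json
import re

# Hypothetical per-type patterns; a real schema would need many more cases
# (escapes inside strings, floats, nesting, optional whitespace, and so on).
TYPE_PATTERNS = {
    "string": r'"[^"\\]*"',
    "integer": r"-?(0|[1-9][0-9]*)",
    "boolean": r"(true|false)",
}


def schema_to_regex(schema):
    """Build a regex for a flat JSON object whose keys appear in a fixed order."""
    parts = [
        rf'"{re.escape(key)}": {TYPE_PATTERNS[type_name]}'
        for key, type_name in schema.items()
    ]
    return r"\{" + ", ".join(parts) + r"\}"


def is_allowed(pattern, text):
    """Return True if the whole text satisfies the constraint."""
    return re.fullmatch(pattern, text) is not None


schema = {"name": "string", "age": "integer"}
pattern = schema_to_regex(schema)

good = '{"name": "Alice", "age": 30}'
bad = '{"name": "Alice", "age": "thirty"}'

print(is_allowed(pattern, good))
print(is_allowed(pattern, bad))
print(json.loads(good))
```

Any string that passes the check also parses as JSON with the expected field types, which is the guarantee a constrained decoder aims to provide up front instead of through retry loops.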

@dailydaniel

@varunshenoy Are you planning to add support for using it with SGLang's OpenAI-like server?

3 participants