[router] LSH based prefix cache aware router #672

gaocegege · 2025-02-14T04:32:07Z

🚀 Feature Description and Motivation

Right now, we're using xxhash in #641 for our prefix cache-aware router. We might consider switching to a consistent hash + LSH-based approach, which could reduce matching accuracy a bit but would simplify scaling pods up and down. Related discussion: vllm-project/production-stack#59 (comment).
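To make the idea concrete, here is a rough sketch of one way a consistent hash + LSH router could look. It is not the #641 implementation and not necessarily what the linked discussion proposes. It assumes MinHash over token shingles as the LSH scheme, and the names `minhash_signature`, `ConsistentHashRing`, and `route`, as well as the parameters `shingle`, `num_hashes`, `max_tokens`, and `vnodes`, are all hypothetical. It uses `hashlib.blake2b` in place of xxhash only to stay within the standard library.

```python
import bisect
import hashlib


def stable_hash(text: str, seed: int = 0) -> int:
    """64-bit hash that is stable across processes (built-in hash() is salted)."""
    digest = hashlib.blake2b(
        text.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "big")
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, shingle=4, num_hashes=2, max_tokens=256):
    """MinHash over token shingles of the prompt head (assumed LSH scheme).

    Prompts whose shingle sets overlap heavily are likely, but not
    guaranteed, to get the same signature. More hashes means fewer
    false matches and more missed ones.
    """
    head = tokens[:max_tokens]
    if len(head) <= shingle:
        shingles = {" ".join(head)}
    else:
        shingles = {
            " ".join(head[i:i + shingle]) for i in range(len(head) - shingle + 1)
        }
    return tuple(
        min(stable_hash(s, seed) for s in shingles) for seed in range(num_hashes)
    )


class ConsistentHashRing:
    """Hash ring with virtual nodes; pod names here are placeholders."""

    def __init__(self, pods, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, pod)
        for pod in pods:
            self.add_pod(pod)

    def add_pod(self, pod):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{pod}#{i}"), pod))

    def remove_pod(self, pod):
        self._ring = [entry for entry in self._ring if entry[1] != pod]

    def lookup(self, key: int) -> str:
        if not self._ring:
            raise LookupError("no pods registered")
        # First ring entry at or after the key, wrapping around at the end.
        idx = bisect.bisect_right(self._ring, (key, "")) % len(self._ring)
        return self._ring[idx][1]


def route(tokens, ring):
    """Map a prompt to a pod: LSH bucket first, then consistent hashing."""
    signature = minhash_signature(tokens)
    bucket = stable_hash(",".join(map(str, signature)))
    return ring.lookup(bucket)
```

In this sketch, two requests that share the same long document head (for example, the same document followed by different questions) fall into the same bucket and therefore reach the same pod. Adding a pod only moves the buckets that now land on the new pod's virtual nodes, which is where the simpler scaling would come from. The accuracy cost is that near-identical prompts can still land in different buckets, and unrelated prompts can occasionally share one.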

Use Case

N/A

Proposed Solution

No response

Jeffwan · 2025-02-14T18:44:46Z

@varungup90 @DwyaneShi can you spend some time on this issue?

gaocegege · 2025-02-15T05:04:30Z

It's just a proposal; I don't know if it helps in the chat use case. But it works well with long document QA. Ref vllm-project/production-stack#59 (comment)

Jeffwan assigned DwyaneShi and varungup90 Feb 14, 2025
