

Does Dify have plans to support rerank and embedding models launched by vLLM? #11857

Closed
4 of 5 tasks
massif-01 opened this issue Dec 19, 2024 · 5 comments
Labels
🙋‍♂️ question This issue does not contain proper reproduce steps or it only has limited words without details.

Comments

@massif-01 (Contributor)

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Does Dify have plans to support rerank and embedding models launched by vLLM?
When deploying large models and Dify locally on memory-constrained devices (such as an Orin X or a Mac mini with 64 GB of RAM), launching the models with Xinference consumes precious memory. Launching the various models together with vLLM is a better choice.

2. Additional context or comments

No response

3. Can you help us with this feature?

  • I am interested in contributing to this feature.
@dosubot dosubot bot added the 🙋‍♂️ question This issue does not contain proper reproduce steps or it only has limited words without details. label Dec 19, 2024
@HiddenPeak

+1

@crazywoola (Member)

Because of #11588, we no longer accept PRs related to model runtimes.
We are going to launch v1.0 in the near future. After that, I think the community could help with this.

@HiddenPeak
Copy link

vLLM rerank task API:

  • Route: /score, Methods: POST
  • Route: /v1/score, Methods: POST

I tried changing the route from /v1/score to /v1/rerank, but Dify showed me a parameter error notice.

waiting for your v1.0 😭
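To make the parameter mismatch concrete, here is a minimal sketch of the kind of JSON body a rerank-style call sends. The model name and field names here are illustrative assumptions (Jina/Cohere-style fields), not confirmed against any particular vLLM version; vLLM's /score route may expect different fields, which would explain the parameter error Dify reports.

```python
import json

# Illustrative rerank-style request body. The model name and exact
# schema are assumptions for this sketch, not vLLM's confirmed API.
body = {
    "model": "example-reranker",  # hypothetical model name
    "query": "What is Dify?",
    "documents": [
        "Dify is an LLM application platform.",
        "vLLM is a high-throughput inference engine.",
    ],
}
payload = json.dumps(body)
print(payload)
```

If the serving route expects a different schema (e.g. score-style text pairs instead of query/documents), the provider and server will disagree even when the URL path matches.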

dosubot bot commented Jan 22, 2025

Hi, @massif-01. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary

  • You inquired about Dify's plans to support rerank and embedding models from vLLM for better memory efficiency.
  • HiddenPeak supported your inquiry and provided details about vLLM's rerank task API, noting a parameter error when modifying the route.
  • Crazywoola mentioned that due to a related issue, pull requests for model runtimes are not being accepted, but community help might be possible post Dify v1.0 launch.
  • HiddenPeak expressed anticipation for the Dify v1.0 release.

Next Steps

  • Please let us know if this issue is still relevant to the latest version of the Dify repository by commenting here.
  • If there is no further activity, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jan 22, 2025
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 6, 2025
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 6, 2025
@hundredwz (Contributor)

hundredwz commented Feb 17, 2025

> vLLM rerank task API:
>
>   • Route: /score, Methods: POST
>   • Route: /v1/score, Methods: POST
>
> I tried changing the route from /v1/score to /v1/rerank, but Dify showed me a parameter error notice.
>
> waiting for your v1.0 😭

vLLM 0.7.2 now supports /v1/rerank.

However, we still face a 404 error when trying to add a rerank model to Dify, since there is a bug in rerank.py.

We could modify the code at L67 to the following:

    data = {"model": model_name, "query": query, "documents": docs, "return_documents": True}
    if top_n is not None:
        data["top_n"] = top_n

If permitted, I could submit a PR to fix it.
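The fix above can be sketched as a small helper. The function name is hypothetical, but the payload-building logic mirrors the snippet: top_n is included only when the caller actually sets it, so the server never receives a null value.

```python
def build_rerank_payload(model_name, query, docs, top_n=None):
    """Build a rerank request body (hypothetical helper, field names
    mirror the snippet above). top_n is added only when set, so the
    server never sees a null value it might reject."""
    data = {
        "model": model_name,
        "query": query,
        "documents": docs,
        "return_documents": True,
    }
    if top_n is not None:
        data["top_n"] = top_n
    return data
```

With this shape, a call without top_n omits the key entirely instead of sending "top_n": null, which is the behavior the proposed fix restores.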

Development

No branches or pull requests

4 participants