How to implement streaming output for RAG responses? #13
Comments
First, streaming output has to be enabled on the model itself; then, roughly:

def stream_large_model_api():
    ...

if __name__ == "__main__":
    ...
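A minimal sketch of what a streaming stream_large_model_api() might look like, assuming the model is served behind an OpenAI-compatible endpoint (the base_url, api_key, and model name below are placeholders, not from the original reply):

from openai import OpenAI

def stream_large_model_api():
    # Placeholder endpoint, key, and model name: point these at the deployed service
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    # stream=True tells the server to send tokens back as they are generated
    response = client.chat.completions.create(
        model="my-model",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in response:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

if __name__ == "__main__":
    stream_large_model_api()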
Thanks for the reply. Do you remember the following code from enhance_llm/qa_bot/part_2_plus.py? I want to switch the LangChain RAG prediction output to streaming, i.e. make the code below stream its output. Do I need to replace the chat_llm_chain.predict method, and if so, how? Do I have to rewrite the chat_llm_chain chain?
LangChain should have built-in support for this; take a look at how the official docs do it.
Thanks for the answer. According to the official docs, the model has to be deployed and accessed through an OpenAI-compatible interface before LCEL streaming output works. The approach is as follows, for anyone who needs it: import asyncio, define an async def main(), and drive it with asyncio.run(main()).
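A minimal sketch of that LCEL streaming setup, assuming a locally deployed, OpenAI-compatible endpoint (the base_url, model name, and prompt below are illustrative, not the original collapsed snippet):

import asyncio

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Placeholder model config: point it at the actually deployed service
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="my-model",
    streaming=True,
)
prompt = ChatPromptTemplate.from_template(
    "Chat history:\n{chat_history}\n\nContext:\n{context}\n\nQuestion: {question}"
)
chain = prompt | llm | StrOutputParser()

async def main():
    # astream yields text chunks as soon as the model produces them
    async for chunk in chain.astream(
        {"chat_history": "...", "context": "...", "question": "..."}
    ):
        print(chunk, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())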
part_2_plus.py
# Generate the reply
res = chat_llm_chain.predict(
    # Format the chat history as newline-separated "role: content" lines
    chat_history="\n".join([f"{entry['role']}: {entry['content']}" for entry in self.memory]),
    # chat_history="\n".join(self.memory),
    context=context,
    question=query
)
How can this code be changed to streaming output? I tried rewriting it with async for chunk in chat_llm_chain.astream(), but it failed. Does anyone know which method to call here to get streaming output?
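For reference, one possible route is to keep the existing LLMChain but attach a streaming callback to the underlying LLM, so tokens are printed while predict() runs; another is to rewrite the chain in LCEL (as in the snippet above) and iterate with astream. A sketch of the callback route, where the LLM construction and the prompt are placeholders rather than the actual code in part_2_plus.py:

from langchain.chains import LLMChain
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Placeholder LLM and prompt: the real ones live in part_2_plus.py
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="my-model",
    streaming=True,  # streaming must be enabled on the model itself
    callbacks=[StreamingStdOutCallbackHandler()],
)
prompt = PromptTemplate.from_template(
    "Chat history:\n{chat_history}\n\nContext:\n{context}\n\nQuestion: {question}"
)
chat_llm_chain = LLMChain(llm=llm, prompt=prompt)

# predict() still returns the full string at the end, but the callback
# prints each token to stdout as the model generates it
res = chat_llm_chain.predict(
    chat_history="user: hi",
    context="...",
    question="...",
)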