Playground inference is too slow #42

Open
nieallen opened this issue Apr 12, 2023 · 2 comments
Comments

@nieallen

Does the playground load the model again on every generate call? What can I change to make it faster?

@HarderThenHarder
Owner

HarderThenHarder commented Apr 12, 2023

Hi, under normal circumstances the current playground should load the model only once. It reloads the model only when the page is refreshed.

I cache the model here, so it is reloaded only after the page is refreshed (which clears the cache).

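There are two likely reasons for slow generation:

  1. The model is asked to generate very long text, which lengthens inference time.
  2. The model is loaded with LoRA rather than as the original model, which can also add a small amount of inference latency. You can train the model with the latest code, which saves the model in the original model's structure (rather than as a LoRA adapter).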
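To illustrate the second point, the sketch below shows why a model saved in the original structure avoids the extra LoRA work at inference time. It uses the standard LoRA formulation, where the adapted layer computes `W x + (alpha / r) * B (A x)`, and folds the low-rank update into the weight once so that inference needs a single matrix product. The helper names and the tiny matrices are illustrative assumptions, not the repository's save logic.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


def lora_forward(W, A, B, x, alpha, r):
    """Unmerged path: base output plus the scaled low-rank correction."""
    low_rank = matmul(B, matmul(A, x))
    return add(matmul(W, x), scale(low_rank, alpha / r))


def merge_lora(W, A, B, alpha, r):
    """Merged path: fold the adapter into the weight, W + (alpha / r) * B A."""
    return add(W, scale(matmul(B, A), alpha / r))


# Toy shapes: W is 2x2, B is 2x1, A is 1x2, so the rank r is 1.
W = [[1, 2], [3, 4]]
A = [[1, 0]]
B = [[1], [2]]
x = [[1], [1]]

merged = merge_lora(W, A, B, alpha=2, r=1)
# One matrix product with the merged weight matches the two-branch LoRA output.
assert matmul(merged, x) == lora_forward(W, A, B, x, alpha=2, r=1)
```

The merged weight has the same shape as the original, which is why a checkpoint saved this way loads like a plain model with no adapter branch to evaluate.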

@nieallen
Author

Thanks, that solved it. Do you plan to add SFT training code based on bloom or glm later?
