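Hello, I saw in the paper that the final comparison training uses a scoring and ranking model built from a single linear layer. Does that mean this step does not use reinforcement learning?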
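Yes. We do not currently use reinforcement learning to train our model; the human preference model is currently used only to filter the model's responses.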
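For readers wondering what "filtering only" can look like in practice, here is a minimal sketch of picking the top-scoring response with a linear scoring head. It is not the project's implementation: `linear_score`, `select_best`, the toy `featurize` function, and the weights are all hypothetical and chosen only for illustration. No policy update takes place, which is why this is not reinforcement learning.

```python
from typing import Callable, List, Sequence


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Score one response with a single linear layer: w . x + b."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def select_best(
    candidates: List[str],
    featurize: Callable[[str], Sequence[float]],
    weights: Sequence[float],
    bias: float = 0.0,
) -> str:
    """Return the candidate with the highest preference score.

    The generator is never updated here; the score is used only to
    filter (rank and pick) among already-generated responses.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=lambda c: linear_score(featurize(c), weights, bias))


# Toy, hypothetical features: response length and number of question marks.
def featurize(text: str) -> List[float]:
    return [float(len(text)), float(text.count("?"))]


responses = ["ok", "longer answer", "what??"]
best = select_best(responses, featurize, weights=[1.0, -2.0])
```

In a real system the features would presumably come from a language model's hidden state rather than hand-written counts; that detail is an assumption here, not something stated in this thread.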
OK, thank you for your answer.