IEKT does not converge when run on the Bridge2006 dataset #149

Open
MyGithub1234567890 opened this issue Dec 5, 2023 · 2 comments

MyGithub1234567890 commented Dec 5, 2023

No description provided.

@MyGithub1234567890
Author

f3a2747b61de6ce8701217ff30fa6ee

@sonyawong
Collaborator

f3a2747b61de6ce8701217ff30fa6ee

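My understanding is that IEKT's training uses a Policy Gradient reinforcement learning algorithm (see Section 4.3, Model Learning, in the paper), so the loss is expected to oscillate. That said, the valid AUC keeps rising throughout, which means the model is still learning, right up until it hits the early stop we configured.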
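To illustrate why an oscillating loss need not be a problem, here is a minimal sketch of early stopping driven by validation AUC instead of training loss. The names `EarlyStopper`, `run`, and `patience`, as well as the numbers in the usage note, are hypothetical and made up for illustration; this is not the training loop actually used in this repository.

```python
class EarlyStopper:
    """Track the best validation AUC and signal a stop after
    `patience` epochs without improvement (hypothetical helper)."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_auc = float("-inf")
        self.best_epoch = -1

    def should_stop(self, epoch, valid_auc):
        # Only validation AUC decides; the training loss is ignored,
        # so a noisy policy-gradient loss cannot trigger a stop.
        if valid_auc > self.best_auc:
            self.best_auc = valid_auc
            self.best_epoch = epoch
        return epoch - self.best_epoch >= self.patience


def run(history, patience):
    """`history` is a list of (train_loss, valid_auc) pairs, one per epoch.
    Returns the best epoch and its validation AUC."""
    stopper = EarlyStopper(patience)
    for epoch, (_train_loss, valid_auc) in enumerate(history):
        if stopper.should_stop(epoch, valid_auc):
            break
    return stopper.best_epoch, stopper.best_auc
```

With an invented history such as `[(5.0, 0.60), (9.0, 0.62), (3.0, 0.65), (12.0, 0.64), (7.0, 0.63), (10.0, 0.64)]` and `patience=3`, the loss jumps around while the AUC climbs to its best value at epoch 2, and the loop stops three epochs later. The point is only that the stopping criterion and the model selection both follow valid AUC, so the loss curve alone does not tell you whether training has failed.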
