Which bert training program should I use? #46

Answered by Cryolite
timercrack asked this question in Q&A

In phase1, the model learns to imitate the human behavior recorded in the training data. In phase2, it learns to predict the point delta of each round, given the choices made in the training data. Both phases use the same training data, which contains the human actions as well as the point delta of each round; the only difference between phase1 and phase2 is the objective function they compute.
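
To make the difference concrete, here is a minimal sketch of two objective functions computed from one record. It is not the repository's actual code: the `Record` layout, the function names, the use of cross-entropy for phase1, the use of squared error for phase2, and the idea that the model emits one value per candidate action are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    # Hypothetical record layout; the real training data format is not shown here.
    human_action: int   # index of the action the human actually chose
    round_delta: float  # point change the player ended the round with


def phase1_loss(logits: List[float], record: Record) -> float:
    """Imitation objective (assumed: cross-entropy against the human's choice)."""
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[record.human_action]


def phase2_loss(predicted_deltas: List[float], record: Record) -> float:
    """Point-delta objective (assumed: squared error for the action that was taken)."""
    predicted = predicted_deltas[record.human_action]
    return (predicted - record.round_delta) ** 2


# The same record feeds both objectives; only the loss computation changes.
record = Record(human_action=1, round_delta=3.0)
print(phase1_loss([0.0, 0.0], record))
print(phase2_loss([0.0, 0.0], record))
```

Note that `round_delta` is ignored by `phase1_loss`, while `phase2_loss` uses `human_action` only to select which prediction to compare against the observed delta.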

Answer selected by timercrack