Policy vs. reference model gradient updates in ch07 on DPO #472

tt7533 started this conversation in General

Replies: 2 comments 3 replies

rasbt Jan 7, 2025
Maintainer

@tt7533

rasbt Jan 8, 2025
Maintainer

2 participants