Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CGPO] Mixture of judges #2159

Open
wants to merge 38 commits into
base: main
Choose a base branch
from

Conversation

gaetanlop
Copy link
Contributor

What does this PR do?

This PR adds the Mixture of judges part of the CGPO paper (https://arxiv.org/pdf/2409.20370). The judges are described in section 4.1.4 and the mixture of judges simply labels a generation as “violated” (0) if it fails any one of the constraint judgments and “satisfied” (1) otherwise.

Related to #2156

Before submitting

  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines.
  • Did you write any new necessary tests?

Who can review?

@kashif @lewtun

@gaetanlop
Copy link
Contributor Author

gaetanlop commented Oct 3, 2024

Still a draft PR Ready

@gaetanlop gaetanlop marked this pull request as ready for review October 4, 2024 01:01
trl/trainer/judges.py Outdated Show resolved Hide resolved
trl/trainer/judges.py Outdated Show resolved Hide resolved
trl/trainer/judges.py Outdated Show resolved Hide resolved
trl/trainer/judges.py Outdated Show resolved Hide resolved
trl/trainer/judges.py Outdated Show resolved Hide resolved
@qgallouedec qgallouedec added the 👨‍⚖️ judge Related to judges label Oct 4, 2024
@qgallouedec
Copy link
Member

Thanks a lot @gaetanlop. Added some suggestion and open questions

@qgallouedec
Copy link
Member

Also, please make sure to run the pre-commits (make precommit)

@gaetanlop
Copy link
Contributor Author

Hey @qgallouedec, thanks for the review. I have added a gold_answers parameter to the judge function for both the base and MoJs class. Also, I have added a safety_judge and factuality_judge as described in the CGPO paper. They also have rule based judges, but in my opinion they are a little bit too tailored to specific tasks (coding/maths) to be added to the trl library. If you think the factuality and safety judges are also too specific to be in the lib I can remove them from the PR.

For the naming of the judges, let's do BaseBinaryJudge for the base class and AllTrueJudge for the moj following your suggestions?

trl/trainer/judges.py Outdated Show resolved Hide resolved
@qgallouedec
Copy link
Member

Hey @qgallouedec, thanks for the review. I have added a gold_answers parameter to the judge function for both the base and MoJs class. Also, I have added a safety_judge and factuality_judge as described in the CGPO paper. They also have rule based judges, but in my opinion they are a little bit too tailored to specific tasks (coding/maths) to be added to the trl library. If you think the factuality and safety judges are also too specific to be in the lib I can remove them from the PR.

Thanks a lot! Nice work!

For the naming of the judges, let's do BaseBinaryJudge for the base class and AllTrueJudge for the moj following your suggestions?

LGTM.


WDYT of having generic classes in trl.judges (AllTrueJudge, BinaryJudge etc.) and subclass them in trl.trainer.cgpo_trainer to get SafetyConstraintJudge and FacultyConstraintJudge? If in the future we need these classes elsewhere, we can still move them in trl.judges

@gaetanlop
Copy link
Contributor Author

Ok, I have removed the FacultyConstraintJudge and the SafetyConstraintJudge from the PR and made the required renamings. Thanks @qgallouedec for the feedback.

trl/trainer/judges.py Outdated Show resolved Hide resolved

@abstractmethod
def judge(
self, prompts: List[str], completions: List[str], gold_answers: List[str] = None, shuffle_order: bool = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "gold_completion" or even "ref_completion" would be more suited, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"gold_completions" seems good to me. I have done the update.

trl/trainer/judges.py Outdated Show resolved Hide resolved
completions: List[str],
gold_completions: Optional[List[str]] = None,
shuffle_order: bool = True,
) -> List[bool]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a list of int to be consistent with the super class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed!

trl/trainer/judges.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
👨‍⚖️ judge Related to judges
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants