Fix typo in GRPO quickstart #4020

dwisdom0 · 2025-09-05T23:50:59Z

What does this PR do?

This PR fixes the GRPO example in the quickstart so that it runs. Before this PR, the example passed a keyword argument, reward_function, that GRPOTrainer does not accept. After this PR, a user can copy and paste the example and run it as-is.

The example in the GRPOTrainer reference documentation has the correct keyword argument reward_funcs.
https://huggingface.co/docs/trl/en/grpo_trainer#trl.GRPOTrainer.example

trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",
    reward_funcs=reward_func,
    train_dataset=dataset,
)
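
For context, here is a minimal sketch of how a complete quickstart-style script might look with the corrected keyword. Only the model string and the reward_funcs keyword come from the snippet above. The reward function `reward_num_unique_chars`, the helper `build_trainer`, and the dataset name `trl-lib/tldr` are assumptions chosen for illustration and may not match what the quickstart uses.

```python
def reward_num_unique_chars(completions, **kwargs):
    """Toy reward (illustrative): number of distinct characters per completion.

    Assumes each completion is a plain string (standard, non-conversational format).
    """
    return [float(len(set(completion))) for completion in completions]


def build_trainer():
    """Construct a GRPOTrainer using the corrected `reward_funcs` keyword.

    Imports are local so the reward function can be tried without trl installed.
    """
    from datasets import load_dataset  # assumes `datasets` is installed
    from trl import GRPOTrainer  # assumes `trl` is installed

    # Assumption: this dataset name is only an example of a prompt dataset.
    dataset = load_dataset("trl-lib/tldr", split="train")

    return GRPOTrainer(
        model="Qwen/Qwen2-0.5B-Instruct",
        reward_funcs=reward_num_unique_chars,  # not `reward_function`
        train_dataset=dataset,
    )
```

With trl and datasets installed, calling `build_trainer().train()` would be the step that starts training. The reward function can be checked on its own without either library.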

Before (erroring out)

Traceback (most recent call last):
  File "/Users/freebie/code/python/llm_chess_rlvr/grpo_hello_world.py", line 9, in <module>
    trainer = GRPOTrainer(
              ^^^^^^^^^^^^
TypeError: GRPOTrainer.__init__() got an unexpected keyword argument 'reward_function'
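
The failure itself is ordinary Python behavior: passing a keyword that a constructor does not declare raises TypeError. The sketch below reproduces it with a hypothetical stand-in class (`FakeTrainer` is not part of TRL and only mirrors the keyword names from the snippet above), and shows how `inspect.signature` can list the accepted keyword names before calling.

```python
import inspect


class FakeTrainer:
    """Hypothetical stand-in; it only mirrors the keyword names in the snippet above."""

    def __init__(self, model, reward_funcs, train_dataset=None):
        self.model = model
        self.reward_funcs = reward_funcs
        self.train_dataset = train_dataset


def accepted_keywords(cls):
    """Return the parameter names of cls.__init__, excluding `self`."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


def try_construct(**kwargs):
    """Return the error message if construction fails, else None."""
    try:
        FakeTrainer(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# The misspelled keyword fails in the same way as the traceback above.
error_message = try_construct(model="m", reward_function=len)
```

Running `accepted_keywords` against the real class in an installed copy of trl is one way to confirm the expected keyword name before copying an example.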

After (running successfully)

config.json: 100%|██████████████████████████████████████████████████████████████████████████████████| 659/659 [00:00<00:00, 8.35MB/s]
model.safetensors: 100%|██████████████████████████████████████████████████████████████████████████| 988M/988M [00:12<00:00, 81.9MB/s]
generation_config.json: 100%|███████████████████████████████████████████████████████████████████████| 242/242 [00:00<00:00, 3.79MB/s]
tokenizer_config.json: 7.30kB [00:00, 22.6MB/s]
vocab.json: 2.78MB [00:00, 52.5MB/s]
merges.txt: 1.67MB [00:00, 159MB/s]
tokenizer.json: 7.03MB [00:00, 194MB/s]
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151643}.
  0%|                                                                                                     | 0/350166 [00:00<?, ?it/s]

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

qgallouedec

Thanks!

fix typo in GRPO quickstart

93c39c3

qgallouedec approved these changes Sep 6, 2025

kashif merged commit f5c2fec into huggingface:main Sep 6, 2025

dwisdom0 deleted the patch-1 branch September 6, 2025 17:06

SamY724 pushed a commit to SamY724/trl that referenced this pull request Sep 6, 2025

Fix typo in GRPO quickstart (huggingface#4020)

ec98525
