
Pull requests: huggingface/trl

Pull requests list

feat: implement DeepSeek unbiased KL estimator for GRPO
#4638 opened Dec 7, 2025 by jlcanta
2 tasks done
Preserve truncated tokens in BFD packing
#4632 opened Dec 5, 2025 by qgallouedec
Update docs landing with latest details
#4624 opened Dec 4, 2025 by sergiopaniego
6 tasks
[online trainers] add vllm lora adapter support
#4590 opened Nov 27, 2025 by kashif
5 tasks
Add PSPO trust region method as alternative to clipping in GRPOTrainer
#4548 opened Nov 19, 2025 by MCDwyer
2 of 5 tasks
fix: add vllm_group_port
#4545 opened Nov 19, 2025 by pointerhacker
3 of 5 tasks
Add compute_metrics parameter for GRPOTrainer
#4534 opened Nov 17, 2025 by colinzhaoxp
Make skip_special_tokens configurable
#4521 opened Nov 13, 2025 by taha-yassine
3 of 5 tasks
[GRPO] switch grpo liger loss to triton version
#4519 opened Nov 13, 2025 by kashif
1 of 8 tasks
adding [SimPER](https://arxiv.org/abs/2502.00883)
#4486 opened Nov 6, 2025 by leeparkuky
2 of 5 tasks
added 10 papers (+trainer cross-links) for #4407
#4441 opened Nov 3, 2025 by SSusantAchary
4 tasks done
refactor: simplify parameter freezing in modeling_base.py
#4305 opened Oct 20, 2025 by Ki-Seki
2 of 5 tasks
Add CISPO loss option and documentation
#4298 opened Oct 16, 2025 by gustavorubim
Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)
#4257 opened Oct 11, 2025 by FabianSchuetze
1 of 3 tasks
Online-dpo-ben
#4252 opened Oct 10, 2025 by burtenshaw Draft
5 tasks
Add support for Python 3.14
#4225 opened Oct 8, 2025 by albertvillanova