Pull requests: huggingface/trl

feat: add TPO trainer
#5591 opened Apr 18, 2026 by JeanKaddour Draft
4 of 8 tasks
Add tiny Qwen3-4B-Instruct-2507
#5586 opened Apr 17, 2026 by qgallouedec Member
[docs] Add chat templates page to web docs
#5581 opened Apr 17, 2026 by sergiopaniego Member
8 tasks
Update AsyncGRPO example with GSM8K and tested hyperparameters
#5580 opened Apr 17, 2026 by sergiopaniego Member
8 tasks
Chunked Cross-Entropy
#5575 opened Apr 17, 2026 by qgallouedec Member Draft
Add training chat template for Qwen3-2507
#5574 opened Apr 16, 2026 by SwayamInSync Contributor
refactor: self distillation trainers (sdpo/sdft/...)
#5573 opened Apr 16, 2026 by LeonEricsson Collaborator
2 of 8 tasks
Improve BrowserGym examples for latest OpenEnv version
#5568 opened Apr 16, 2026 by sergiopaniego Member
8 tasks
Set _tokenizer attribute in experimental trainers
#5566 opened Apr 16, 2026 by albertvillanova Member
DataCollatorForPreference checking 'margin' in all examples
#5564 opened Apr 15, 2026 by antoinsader
5 of 8 tasks
Revert VLM support in parse_response
#5561 opened Apr 15, 2026 by qgallouedec Member
Accept processor in get_training_chat_template
#5560 opened Apr 15, 2026 by qgallouedec Member
Check prefix preservation at the token level
#5559 opened Apr 15, 2026 by qgallouedec Member
Move experimental example scripts into their trainer folders
#5556 opened Apr 15, 2026 by sergiopaniego Member
1 of 8 tasks
Add support for prompt-completion format in DistillationTrainer
#5555 opened Apr 15, 2026 by cmpatino Collaborator
3 of 6 tasks
Fix GRPO VLM tests: Multimodal training requires conversational prompts
#5550 opened Apr 15, 2026 by kaixuanliu Contributor
3 tasks done
Drop vLLM 0.11 support
#5549 opened Apr 14, 2026 by qgallouedec Member
Differentiate Phi-3 and Phi-3.5 in tests
#5546 opened Apr 14, 2026 by qgallouedec Member
fix: Pass AsyncGRPOTrainer's processing_class to AsyncRolloutWorker
#5538 opened Apr 14, 2026 by xuanduy04 Contributor
2 of 8 tasks
feat: add Phi-3 training chat template with generation markers
#5526 opened Apr 12, 2026 by RudrenduPaul Contributor
2 of 4 tasks
feat: add Gemma/Gemma2 training chat templates with generation markers
#5523 opened Apr 11, 2026 by ps-abhi
5 of 8 tasks
feat(glm-4-moe): Add {% generation %} markers for training chat template
#5519 opened Apr 10, 2026 by casinca Contributor
5 of 8 tasks
[WIP] Fix OnlineDPO vLLM server completion handling
#5516 opened Apr 10, 2026 by JohnGiorgi Contributor Draft
5 of 8 tasks