Issues: huggingface/trl

Issues list

Multiple Errors with PPOTrainer. error in ppo_trainer.dataloader
Labels: 🐛 bug, 🏋 PPO
#2340 opened Nov 10, 2024 by Debolena7

Difference between SFTTrainer and Seq2seqTrainer
Labels: ❓ question, 🏋 SFT
#2339 opened Nov 9, 2024 by Hyfred

RuntimeError: chunk expects at least a 1-dimensional tensor
Labels: 🐛 bug, 🏋 SFT
#2338 opened Nov 8, 2024 by imrankh46
4 tasks done

DPO Training DataLoader is not shuffled
Labels: 🏋 DPO, ✨ enhancement
#2337 opened Nov 7, 2024 by kaiwenw
4 tasks

Accelerator package version problem
Labels: 🐛 bug, 🚀 deepspeed, 🏋 PPO
#2335 opened Nov 7, 2024 by littleshutong
2 of 4 tasks

RLooTrainer bug when using deepspeed
Labels: 🐛 bug, 🚀 deepspeed, 🏋 RLOO
#2329 opened Nov 6, 2024 by macheng6
2 of 4 tasks

Support for MiniCPM-V Reinforcement Learning with Direct Preference Optimization (DPO)
Labels: 🏋 DPO, ❓ question, 👁️ VLM
#2326 opened Nov 5, 2024 by DarioPTWR

Several problems in RLOOTrainer
Labels: ❓ question, 🏋 RLOO
#2316 opened Nov 4, 2024 by serendipity800
2 of 4 tasks

Using a different ref_model from model leads to incorrect results
Labels: ✨ enhancement, ❓ question
#2307 opened Nov 1, 2024 by DarshanDeshpande
2 of 4 tasks

Whether chatglm3 6b is supported by trl ?
Labels: 🐛 bug
#2299 opened Oct 31, 2024 by fjy01
2 of 4 tasks

Code migration suggestions
Labels: 🏋 DPO, ⏳ needs more info, ❓ question
#2296 opened Oct 30, 2024 by MonolithFoundation

OOM when finetuning Llama3.2-90B on 8xA100 80GB
#2294 opened Oct 29, 2024 by maximilianmordig
2 of 4 tasks

wrong objective/entropy in RLOOTrainer
Labels: 🐛 bug, 🏋 RLOO
#2281 opened Oct 25, 2024 by serendipity800
1 of 4 tasks

Helper function for getting reward model and judge
Labels: ✨ enhancement
#2271 opened Oct 24, 2024 by qgallouedec

KTOTrainer Memory Leakage
Labels: 🐛 bug, 🏋 KTO
#2268 opened Oct 24, 2024 by Isaaclgz
2 of 4 tasks

Significant Difference between torchrun launch and accelerate launch
Labels: ❓ question
#2262 opened Oct 21, 2024 by SinclairCoder
2 of 4 tasks

OOM when unwrap_model_for_generation
Labels: 🐛 bug
#2250 opened Oct 18, 2024 by hlnchen
2 of 4 tasks

Add model merging callback
Labels: ✨ enhancement, 🧒 good second issue
#2241 opened Oct 16, 2024 by lewtun

online DPO evaluation
Labels: 🐛 bug, 🏋 Online DPO
#2228 opened Oct 14, 2024 by woshizouguo
1 of 4 tasks

[Trainer] Changing the dataset dynamically during training
Labels: 🏋 DPO, ❓ question
#2227 opened Oct 14, 2024 by ilyasoulk