generated from fastai/nbdev_template
Issues: huggingface/trl
[Question] Why is Importance Sampling and Clipping applied in RLOO?
#2341
opened Nov 10, 2024 by
shashankg7
Multiple Errors with PPOTrainer. error in ppo_trainer.dataloader
🐛 bug
Something isn't working
🏋 PPO
Related to PPO
#2340
opened Nov 10, 2024 by
Debolena7
Difference between SFTTrainer and Seq2seqTrainer
❓ question
Seeking clarification or more information
🏋 SFT
Related to SFT
#2339
opened Nov 9, 2024 by
Hyfred
RuntimeError: chunk expects at least a 1-dimensional tensor
🐛 bug
Something isn't working
🏋 SFT
Related to SFT
#2338
opened Nov 8, 2024 by
imrankh46
4 tasks done
DPO Training DataLoader is not shuffled
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2337
opened Nov 7, 2024 by
kaiwenw
4 tasks
Accelerator package version problem
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 PPO
Related to PPO
#2335
opened Nov 7, 2024 by
littleshutong
2 of 4 tasks
RLooTrainer bug when using deepspeed
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 RLOO
Related to RLOO
#2329
opened Nov 6, 2024 by
macheng6
2 of 4 tasks
Support for MiniCPM-V Reinforcement Learning with Direct Preference Optimization (DPO)
🏋 DPO
Related to DPO
❓ question
Seeking clarification or more information
👁️ VLM
Related to Visual Language Models
#2326
opened Nov 5, 2024 by
DarioPTWR
Several problems in RLOOTrainer
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
#2316
opened Nov 4, 2024 by
serendipity800
2 of 4 tasks
RLOOTrainer ignores custom DataCollatorWithPadding in favor of default one
🐛 bug
Something isn't working
🏋 RLOO
Related to RLOO
#2309
opened Nov 2, 2024 by
anch0vy
Using a different ref_model from model leads to incorrect results
❓ question
Seeking clarification or more information
✨ enhancement
New feature or request
#2307
opened Nov 1, 2024 by
DarshanDeshpande
2 of 4 tasks
Whether chatglm3 6b is supported by trl ?
🐛 bug
Something isn't working
#2299
opened Oct 31, 2024 by
fjy01
2 of 4 tasks
Code migration suggestions
🏋 DPO
Related to DPO
⏳ needs more info
Additional information or clarification is required to proceed
❓ question
Seeking clarification or more information
#2296
opened Oct 30, 2024 by
MonolithFoundation
OOM when finetuning Llama3.2-90B on 8xA100 80GB
#2294
opened Oct 29, 2024 by
maximilianmordig
2 of 4 tasks
wrong objective/entropy in RLOOTrainer
🐛 bug
Something isn't working
🏋 RLOO
Related to RLOO
#2281
opened Oct 25, 2024 by
serendipity800
1 of 4 tasks
Feature Request: String-Based Comparison Reward model for RLOOTrainer
✨ enhancement
New feature or request
🏋 RLOO
Related to RLOO
#2280
opened Oct 25, 2024 by
HiroshigeAoki
Conflict between last version of Transformers.Trainer and DPOTrainer.get_batch_samples
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
#2275
opened Oct 24, 2024 by
lucasdegeorge
2 of 4 tasks
Helper function for getting reward model and judge
✨ enhancement
New feature or request
#2271
opened Oct 24, 2024 by
qgallouedec
KTOTrainer Memory Leakage
🐛 bug
Something isn't working
🏋 KTO
Related to KTO
#2268
opened Oct 24, 2024 by
Isaaclgz
2 of 4 tasks
Significant Difference between torchrun launch and accelerate launch
❓ question
Seeking clarification or more information
#2262
opened Oct 21, 2024 by
SinclairCoder
2 of 4 tasks
OOM when unwrap_model_for_generation
🐛 bug
Something isn't working
#2250
opened Oct 18, 2024 by
hlnchen
2 of 4 tasks
Add model merging callback
✨ enhancement
New feature or request
🧒 good second issue
Good for contributors with basic project familiarity
#2241
opened Oct 16, 2024 by
lewtun
online DPO evaluation
🐛 bug
Something isn't working
🏋 Online DPO
Related to Online DPO
#2228
opened Oct 14, 2024 by
woshizouguo
1 of 4 tasks
[Trainer] Changing the dataset dynamically during training
🏋 DPO
Related to DPO
❓ question
Seeking clarification or more information
#2227
opened Oct 14, 2024 by
ilyasoulk