Good First Issue List
Trinity-RFT is a flexible and modular Reinforcement Fine-Tuning framework. We welcome contributions from the community to help improve and expand the framework.
As a starting point, here are some good first issues that new contributors can work on:
- Implement a New Workflow
  - Description: Create a new agentic workflow to tackle specific tasks.
  - Difficulty: Easy or Medium
  - Labels: good first issue, workflow
  - Resources: Check out the Workflow Development Guide for step-by-step instructions.
  - Examples:
    - Gomoku, Sudoku, or other interesting game-playing workflows
    - Workflows to solve benchmarks like Multi-hop QA, MuSiQue, etc.
    - Workflows to adapt to popular agent frameworks like LangChain, AutoGen, etc.
    - ...
- Implement a New RL Algorithm
  - Description: Implement a new reinforcement learning algorithm to improve training efficiency or performance.
  - Difficulty: Medium
  - Labels: good first issue, algorithm
  - Resources: Refer to the Algorithm Development Guide for implementation tutorials.
  - Examples:
- Implement a New Experience Operator
  - Description: Develop a new operator for experience data filtering, augmentation, or reward shaping.
  - Difficulty: Easy or Medium
  - Labels: good first issue, operator
  - Resources: See the Operator Development Guide for guidance.
  - Examples:
    - Implement an operator to filter out low-quality experiences based on predefined criteria.
    - Implement an operator to refine rewards by comparing experiences generated from different runs of the same task.
    - ...
- Improve Examples and Documentation
  - Description: Enhance the existing examples and documentation to help new users get started with Trinity-RFT.
  - Difficulty: Easy
  - Labels: good first issue, documentation
  - Resources: Check the existing Examples and Documentation for areas of improvement.
  - Examples:
    - Add examples for workflows or algorithms implemented but not yet documented in the examples directory.
    - Improve existing documentation for clarity and completeness.
    - ...
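
To make the Experience Operator item more concrete, here is a minimal, framework-independent sketch of the two example operators mentioned above: one that filters out low-quality experiences, and one that refines rewards by comparing runs of the same task. Everything in it is hypothetical and for illustration only: the `Experience` dataclass, its fields, the function names, and the thresholds are assumptions, not Trinity-RFT's actual operator interface. Please follow the Operator Development Guide for the real base classes and registration mechanism.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Experience:
    """Hypothetical stand-in for one rollout record (not Trinity-RFT's class)."""
    task_id: str
    response: str
    reward: float


def filter_low_quality(experiences, min_reward=0.0, min_length=1):
    """Keep only experiences that meet assumed reward and length thresholds."""
    return [
        e for e in experiences
        if e.reward >= min_reward and len(e.response) >= min_length
    ]


def refine_rewards_by_group(experiences, eps=1e-6):
    """Replace each reward with its z-score among runs of the same task."""
    groups = defaultdict(list)
    for e in experiences:
        groups[e.task_id].append(e)

    refined = []
    for group in groups.values():
        rewards = [e.reward for e in group]
        mu, sigma = mean(rewards), pstdev(rewards)
        for e in group:
            # eps avoids division by zero when all runs got the same reward
            normalized = (e.reward - mu) / (sigma + eps)
            refined.append(Experience(e.task_id, e.response, normalized))
    return refined
```

A typical use under these assumptions would chain the two steps, for example `refine_rewards_by_group(filter_low_quality(batch))`, so that rewards are compared only among the experiences that survive filtering. A real operator would likely also need to handle batching and metadata expected by the framework.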
Besides these tasks for beginners, we also have more challenging issues for experienced contributors, such as:
- Reduce the idle time (bubble) caused by the decoupled Explorer / Trainer design to improve resource utilization.
- Improve the efficiency of the experience buffer.
- Add partial rollout support to the Explorer to avoid resource waste caused by the long-tail effect of rollouts in agentic RL scenarios.
- Add popular inference backends like SGLang.
- ...
If you're interested in working on any of these issues, please feel free to comment on the issue or open a pull request. We look forward to your contributions!