
@yuki-97
Contributor

@yuki-97 yuki-97 commented Nov 11, 2025

Follow-up to #1472. Thanks @nv-mmanohara for adding this!

  1. Add GRPO support for HelpSteer3 on LlamaNemotron 49B.
  2. Add SFT support for tulu3 on LlamaNemotron 49B.
  3. Add CodeJaccard environment.
  4. Refactor env and data processor.
  5. Introduce run_grpo.py, which will replace run_grpo_math.py and run_grpo_rm.py in a subsequent PR.
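
For readers unfamiliar with the metric behind item 3, here is a minimal sketch of a Jaccard score between two code strings. It is not the code from this PR: the function name `jaccard_similarity` and the regex tokenizer are assumptions made for illustration, and the actual CodeJaccard environment may tokenize and score differently.

```python
import re


def jaccard_similarity(candidate: str, reference: str) -> float:
    """Jaccard similarity |A & B| / |A | B| over the token sets of two strings."""

    def tokenize(text: str) -> set:
        # Assumed tokenizer: runs of word characters, or any single
        # non-space symbol such as "=" or "(".
        return set(re.findall(r"\w+|[^\w\s]", text))

    a, b = tokenize(candidate), tokenize(reference)
    if not a and not b:
        # Two empty inputs are treated as identical (a convention chosen here).
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(jaccard_similarity("x = foo(1)", "x = bar(1)"))
```

Because the score is computed over sets, token order and repetition are ignored; a reward built this way measures vocabulary overlap with the reference, not functional correctness.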

Summary by CodeRabbit

  • New Features

    • Added HelpSteer3 and Tulu3 datasets for training and evaluation.
    • Introduced CodeJaccard environment for code-based similarity scoring.
    • Enhanced data processor registry system with improved task flexibility.
    • Added new GRPO training script and SFT configurations for Nemotron-49B models.
    • Support for dynamic task naming across datasets.
  • Documentation

    • Updated custom parallel plan paths.
  • Tests

    • Added new GRPO HelpSteer3 and SFT test suites.
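
As a rough illustration of what a data processor registry can look like, the sketch below maps processor names to functions and falls back to a default when the data config names none. All names here (`PROCESSOR_REGISTRY`, `register_processor`, `get_processor`, the `"processor"` config key, and the `strip_prompt` processor) are hypothetical and are not taken from this PR.

```python
from typing import Any, Callable, Dict

Example = Dict[str, Any]
Processor = Callable[[Example], Example]

# Hypothetical registry: processor name -> function that formats one example.
PROCESSOR_REGISTRY: Dict[str, Processor] = {}


def register_processor(name: str) -> Callable[[Processor], Processor]:
    """Decorator that records a processor under `name`."""

    def decorator(fn: Processor) -> Processor:
        if name in PROCESSOR_REGISTRY:
            raise ValueError(f"processor {name!r} is already registered")
        PROCESSOR_REGISTRY[name] = fn
        return fn

    return decorator


@register_processor("default")
def default_processor(example: Example) -> Example:
    # Pass the example through unchanged (as a shallow copy).
    return dict(example)


@register_processor("strip_prompt")
def strip_prompt_processor(example: Example) -> Example:
    # Illustrative task-specific processor: trim whitespace around the prompt.
    out = dict(example)
    out["prompt"] = out["prompt"].strip()
    return out


def get_processor(data_config: Dict[str, Any]) -> Processor:
    """Look up the configured processor, defaulting to "default"."""
    name = data_config.get("processor", "default")
    if name not in PROCESSOR_REGISTRY:
        known = sorted(PROCESSOR_REGISTRY)
        raise KeyError(f"unknown processor {name!r}; known: {known}")
    return PROCESSOR_REGISTRY[name]
```

With a lookup like this, adding a task means registering one function instead of editing the training script.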

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 11, 2025
@yuki-97 yuki-97 added the CI:L1 Run doctests, unit tests, and functional tests label Nov 11, 2025
@yuki-97 yuki-97 force-pushed the yukih/pr-1472 branch 2 times, most recently from c9335d4 to a872ed6 Compare November 11, 2025 09:27
@yuki-97 yuki-97 removed the CI:L1 Run doctests, unit tests, and functional tests label Nov 11, 2025
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Nov 16, 2025
@RayenTian RayenTian removed the CI:L1 Run doctests, unit tests, and functional tests label Nov 16, 2025
@RayenTian RayenTian force-pushed the yukih/pr-1472 branch 2 times, most recently from b7fedb9 to 9078e33 Compare November 16, 2025 03:37
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Nov 16, 2025
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 16, 2025
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 17, 2025
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 17, 2025
yuki-97 and others added 7 commits November 20, 2025 01:25
Signed-off-by: Yuki Huang <[email protected]>
Signed-off-by: ruit <[email protected]>
Signed-off-by: Yuki Huang <[email protected]>
Signed-off-by: ruit <[email protected]>
Signed-off-by: Yuki Huang <[email protected]>
Signed-off-by: ruit <[email protected]>
Signed-off-by: Yuki Huang <[email protected]>
Signed-off-by: ruit <[email protected]>
… processors. Added raw_dataset.py and path.py for improved dataset processing. Updated project-includes in pyrefly.toml and modified grpo.md to reflect new task-dataset mapping. Cleaned up unused code and configurations in various YAML files.

Signed-off-by: ruit <[email protected]>
…or handling

    - Introduced documentation for the new Code Jaccard Environment, detailing its functionality, usage, and configuration.
    - Updated RawDataset class to provide a default processor if none is specified in the data configuration.
    - Enhanced test coverage for the helpsteer3 data processor to ensure correct functionality and output.

    Signed-off-by: ruit <[email protected]>

Signed-off-by: ruit <[email protected]>
@github-actions

⚠️ File Consistency Check

Check based on commit: eb8aa50 (PR #1506 from yukih/pr-1472)

⚠️ Parallel Plans Synchronization Warning

The file nemo_rl/models/dtensor/parallelize.py was modified in this PR, but neither 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/optimized_tp_plans.py nor 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/parallelizer.py was updated.

Why this matters:
These files contain similar parallel plan implementations that should be kept synchronized to ensure consistency across the codebase.

Action required:

  • Please review if the changes in nemo_rl/models/dtensor/parallelize.py should also be applied to 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/optimized_tp_plans.py or 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/parallelizer.py
  • Update the appropriate related file(s) if necessary to maintain functional consistency
  • Request access to the NVIDIA-NeMo/Automodel repository, create a PR against the nemo-rl-submodule branch, and update the Automodel submodule in the nemo-rl index
  • Add @ffrujeri as a reviewer of this PR if you have any questions about the consistency requirements
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/dtensor/parallelize.py
  • Not modified: 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/optimized_tp_plans.py
  • Not modified: 3rdparty/Automodel-workspace/Automodel/nemo_automodel/components/distributed/parallelizer.py

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

  • nemo_rl/models/policy/dtensor_policy_worker.py
  • nemo_rl/models/policy/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.


This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 20, 2025
- Updated CLEVRCoGenTDataset, OpenAIFormatDataset, and SquadDataset to inherit from the RawDataset class for improved dataset handling.
- Added necessary imports for RawDataset in the respective files.

Signed-off-by: ruit <[email protected]>
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 21, 2025
@github-actions

⚠️ File Consistency Check

Check based on commit: e842bfd (PR #1506 from yukih/pr-1472)


…up for vlm grpo

- Added `env_name` to `vlm_grpo_3B_megatron.yaml` and `vlm_grpo_3B.yaml` for environment specification.
- Modified `setup_data` function in `run_vlm_grpo.py` to use `env_name` for environment configuration, enhancing flexibility in dataset processing.
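
A minimal sketch of the kind of lookup this enables, assuming `env_name` sits under a `data` section and per-environment settings under an `env` section. The layout, the helper name `select_env_config`, and the example values are all hypothetical and may not match the YAML files or `setup_data` in this PR.

```python
from typing import Any, Dict, Tuple


def select_env_config(config: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (env_name, env settings) for the environment named in the config."""
    env_name = config["data"]["env_name"]  # assumed location of env_name
    env_configs = config["env"]            # assumed per-environment settings
    if env_name not in env_configs:
        raise KeyError(
            f"env_name {env_name!r} has no entry under 'env'; "
            f"available: {sorted(env_configs)}"
        )
    return env_name, env_configs[env_name]


# Hypothetical config, shaped like a parsed YAML file.
example_config = {
    "data": {"env_name": "my_env"},
    "env": {"my_env": {"num_workers": 8}},
}
name, settings = select_env_config(example_config)
```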

Signed-off-by: ruit <[email protected]>
@RayenTian RayenTian requested a review from a team as a code owner November 21, 2025 06:46
@github-actions

⚠️ File Consistency Check

Check based on commit: 6e4393e (PR #1506 from yukih/pr-1472)


@RayenTian RayenTian added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Nov 21, 2025

Labels

CI:L2 Run doctests, unit tests, functional tests, and convergence tests documentation Improvements or additions to documentation


4 participants