
Conversation

@jstjohn jstjohn commented Jan 21, 2026

Description

Usage

TODO: Add code snippet
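
In the meantime, here is a minimal illustrative sketch of how the new option might be used. This is not the snippet referenced above; the import path and constructor shape are assumptions based on the CodeRabbit walkthrough below (which confirms the HyenaOptimizerConfigOverrideProvider class and the no_weight_decay_embeddings kwarg), and any other constructor or recipe arguments are elided.

```python
# Illustrative sketch only: the import path is inferred from the file layout
# described in the walkthrough, not verified against the code.
from bionemo.evo2.models.evo2_provider import HyenaOptimizerConfigOverrideProvider

# Disable weight decay for embedding parameters via the new override provider.
# The same setting is exposed to the recipes as the `no_weight_decay_embeddings`
# kwarg on Evo2CommonKwargs / _evo2_common().
overrides = HyenaOptimizerConfigOverrideProvider(no_weight_decay_embeddings=True)
```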

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.

  • ciflow:skip - Skip all CI tests for this PR
  • ciflow:notebooks - Run Jupyter notebook execution tests for bionemo2
  • ciflow:slow - Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2
  • ciflow:all - Run all tests (unit tests, slow tests, and notebooks) for bionemo2. This label can be used to enforce running tests for all of bionemo2.
  • ciflow:all-recipes - Run tests for all recipes (under bionemo-recipes). This label can be used to enforce running tests for all recipes.

Unit tests marked as @pytest.mark.multi_gpu or @pytest.mark.distributed are not run in the PR pipeline.

For more details, see CONTRIBUTING

Note

By default, only basic unit tests are run. Add appropriate labels to enable additional test coverage.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Triggering CodeRabbit AI Review

To trigger a code review from CodeRabbit, comment on the pull request with one of its review commands (for example, @coderabbitai review, as used below).

See https://docs.coderabbit.ai/reference/review-commands for a full list of commands.

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

Summary by CodeRabbit

Release Notes

  • New Features

    • Added no_weight_decay_embeddings configuration parameter for Evo2 training recipes to control embedding weight decay behavior.
  • Chores

    • Updated Megatron-related dependency versions.
  • Tests

    • Improved test fixture scoping for better test isolation.

✏️ Tip: You can customize this high-level summary in your review settings.

copy-pr-bot bot commented Jan 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot commented Jan 21, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The PR updates Megatron dependency versions and refactors embedding weight-decay handling in Evo2 by introducing a dedicated HyenaOptimizerConfigOverrideProvider class that manages parameter-specific optimizer overrides, replacing the previous scheduler-based configuration approach with a recipe-integrated one.
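
For orientation only, here is a minimal, self-contained sketch of what an override provider of this shape could look like. It is not the implementation in this PR: the ParamOverride container and the name-based matching rules are assumptions used to illustrate the idea of emitting per-parameter weight-decay and decoupled-LR overrides, whereas the real class plugs into Megatron-Bridge's optimizer configuration framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

import torch


@dataclass
class ParamOverride:
    """One override rule: parameters matching `match` receive the given settings."""

    match: Callable[[str, torch.Tensor], bool]
    weight_decay: Optional[float] = None
    lr: Optional[float] = None


@dataclass
class HyenaOptimizerConfigOverrideProvider:
    """Sketch of a provider emitting optimizer overrides for Hyena/Evo2 models."""

    no_weight_decay_embeddings: bool = True
    decoupled_embedding_lr: Optional[float] = None

    def build_config_overrides(self) -> List[ParamOverride]:
        overrides: List[ParamOverride] = [
            # 1D parameters (biases, layer norms) are excluded from weight decay.
            ParamOverride(match=lambda name, p: p.ndim == 1, weight_decay=0.0),
        ]
        if self.no_weight_decay_embeddings:
            # Optionally exclude embedding weights from weight decay as well.
            overrides.append(
                ParamOverride(match=lambda name, p: "embedding" in name, weight_decay=0.0)
            )
        if self.decoupled_embedding_lr is not None:
            # Optionally give embeddings their own (decoupled) learning rate.
            overrides.append(
                ParamOverride(
                    match=lambda name, p: "embedding" in name,
                    lr=self.decoupled_embedding_lr,
                )
            )
        return overrides
```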

Changes

  • Dependency Updates (bionemo-recipes/recipes/evo2_megatron/pyproject.toml)
    Updated Git revisions for megatron-bridge and megatron-core from 18ef1b61309dd45bc0535fb7c60064b9d8829a35 to c415f4616340a0431c0eae776d0482ab5cc3770e in the UV sources map.
  • Optimizer Provider Implementation (bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/models/evo2_provider.py)
    Added HyenaOptimizerConfigOverrideProvider class implementing build_config_overrides() to emit weight-decay and optional decoupled LR overrides for 1D parameters. Updated imports to support the optimizer configuration framework.
  • Recipe Configuration Integration (bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/recipes/evo2.py)
    Added no_weight_decay_embeddings parameter to Evo2CommonKwargs and _evo2_common(); instantiated HyenaOptimizerConfigOverrideProvider in ConfigContainer; removed scheduler-based weight-decay configuration; imported the new provider class.
  • Training CLI Updates (bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/run/train.py)
    Removed hyena_no_weight_decay_cond_with_embeddings import; propagated no_weight_decay_embeddings via recipe kwargs instead of runtime scheduler modification.
  • Test Fixture Scope Changes (bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/run/test_train.py)
    Transitioned mbridge_checkpoint_7b_1m and base_checkpoint fixtures from session-scoped to module-scoped; updated temporary directory naming and docstrings accordingly (a generic illustration of the scope change follows this list).
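
The fixture-scope change in the last item boils down to swapping the pytest scope argument; a generic illustration follows (the fixture body is a placeholder, not the real checkpoint setup):

```python
import pytest


# Previously scope="session": one checkpoint copy lived for the entire test run.
# Module scope means each test module builds and tears down its own copy.
@pytest.fixture(scope="module")
def base_checkpoint(tmp_path_factory):
    # Placeholder body; the real fixture materializes a converted Evo2 checkpoint.
    return tmp_path_factory.mktemp("base_checkpoint_module")
```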

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Poem

🐰 A bundled approach, so clean and so tight,
Where optimizers dance with overrides right,
No more scheduler tricks in the shadows to hide,
Just config providers with structured pride! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Description check: ❓ Inconclusive
    Explanation: The PR description covers the key aspects (refactor type, no user-level API changes, dependency on the mbridge PR), but the usage code snippet is marked TODO and the pre-submit checklist items are unchecked, indicating incomplete preparation.
    Resolution: Complete the TODO usage code snippet and verify the pre-submit checklist items (local testing, documentation, tests, existing tests passing).

✅ Passed checks (2 passed)

  • Title check: ✅ Passed
    The title clearly and specifically describes the main refactoring effort: updating weight decay skipping for Hyena to align with Megatron-Bridge's current approach.
  • Docstring Coverage: ✅ Passed
    Docstring coverage is 81.82%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

jstjohn commented Jan 21, 2026

Depends on NVIDIA-NeMo/Megatron-Bridge#2010

jstjohn commented Jan 21, 2026

/ok to test dcadb56

jstjohn commented Jan 21, 2026

@coderabbitai review

coderabbitai bot commented Jan 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

jstjohn commented Jan 21, 2026

/ok to test e71d3a3

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@bionemo-recipes/recipes/evo2_megatron/src/bionemo/evo2/recipes/evo2.py`:
- Around lines 246-248: HyenaOptimizerConfigOverrideProvider is constructed with a
  hardcoded no_weight_decay_embeddings=True, which ignores the recipe default and the
  CLI flag. Change the call to forward the actual configured value instead of the
  literal True (e.g., no_weight_decay_embeddings=no_weight_decay_embeddings, or the
  passed-in optimizer flag such as self.config.optimizer.no_weight_decay_embeddings)
  so the provider respects the recipe/CLI setting.
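
As an excerpt-style sketch of the suggested change (the variable name optimizer_overrides and the surrounding ConfigContainer wiring are assumptions; only the kwarg forwarding is the point):

```python
# Before (as flagged): the recipe default and CLI flag are ignored.
# optimizer_overrides = HyenaOptimizerConfigOverrideProvider(no_weight_decay_embeddings=True)

# After: forward the value the recipe actually received.
optimizer_overrides = HyenaOptimizerConfigOverrideProvider(
    no_weight_decay_embeddings=no_weight_decay_embeddings,
)
```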

In `@bionemo-recipes/recipes/evo2_megatron/tests/bionemo/evo2/run/test_train.py`:
- Around lines 535-539: The docstring for the mbridge_checkpoint_7b_1m fixture still
  says "Session-scoped" even though the fixture scope is now module. Update the
  docstring to say "Module-scoped" (or drop the scope mention) and adjust the
  following sentence so it reflects that the fixture lives for the duration of the
  module's tests rather than implying session lifetime.

jstjohn commented Jan 21, 2026

/ok to test 01f238c

jstjohn commented Jan 22, 2026

/ok to test 1f7235d
