Draft: merge any_model: mip_and_realize_models#995
Draft: merge any_model: mip_and_realize_models#995danielkorzekwa merged 47 commits intofeature/puzzletronfrom
Conversation
- Add converter, model_descriptor, puzzformer, and llama model support - Selective merge of anymodel functionality Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…s merged) Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
📝 WalkthroughWalkthroughThis PR modifies the Puzzletron model optimization pipeline by changing how key-value head counts are computed from block configuration, enabling the MIP realization step in the main pipeline, and replacing disabled test assertions with active validation checks for post-processing artifacts. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…ld_library_and_stats
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…kwa/any_model_calc_one_block_scores
…wa/mip_and_realize_models
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…ld_library_and_stats
…kwa/any_model_calc_one_block_scores
…wa/mip_and_realize_models
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/gpu/torch/puzzletron/test_puzzletron.py (1)
144-152: Consider defensive handling for missing directory.If
mip_solutions_dirdoesn't exist (e.g., due to an earlier step failure),iterdir()will raiseFileNotFoundError. A more explicit check could improve debuggability.Optional: Add existence check before iteration
# assertions for the mip_and_realize_models step 6 # Find the MIP solution directory dynamically (e.g., stats_num_local_experts_*) mip_solutions_dir = puzzle_dir / "mip/puzzle_solutions" + assert mip_solutions_dir.exists(), ( + f"MIP solutions directory not found: {mip_solutions_dir}" + ) solution_dirs = [ d for d in mip_solutions_dir.iterdir() if d.is_dir() and d.name.startswith("stats_num_local_experts_") ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/gpu/torch/puzzletron/test_puzzletron.py` around lines 144 - 152, The code assumes mip_solutions_dir exists before calling mip_solutions_dir.iterdir(); add a defensive existence check (e.g., if not mip_solutions_dir.exists() or not mip_solutions_dir.is_dir()) before iterating and raise/assert with a clear error message that includes puzzle_dir and mip_solutions_dir when missing, or set solution_dirs to an empty list and fail the subsequent assertion with that informative message; update the block around mip_solutions_dir, solution_dirs and the following assert to use this check so FileNotFoundError is avoided and debugging is easier.modelopt/torch/puzzletron/puzzletron.py (1)
65-74: Minor: Step numbering skips Step 3.The comments indicate Steps 0, 1, 2, 4, 5, 6 — Step 3 is missing. This might be intentional (perhaps a removed step), but it could cause confusion when referencing steps in logs or documentation.
Consider renumbering for clarity
- # Step 4: build_library_and_stats (single process) + # Step 3: build_library_and_stats (single process) if dist.is_master(): build_library_and_stats.launch_build_library_and_stats(hydra_cfg) dist.barrier() - # Step 5: calc_one_block_scores (distributed processing) + # Step 4: calc_one_block_scores (distributed processing) scoring.launch_scoring(hydra_cfg) - # Step 6: mip_and_realize_models (distributed processing) + # Step 5: mip_and_realize_models (distributed processing) mip_and_realize_models.launch_mip_and_realize_model(hydra_cfg)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/puzzletron.py` around lines 65 - 74, The step comments in puzzletron.py skip Step 3 which is confusing; update the comments around the sequence that includes dist.is_master(), build_library_and_stats.launch_build_library_and_stats, dist.barrier(), scoring.launch_scoring, and mip_and_realize_models.launch_mip_and_realize_model to either renumber them sequentially (e.g., change "Step 4"→"Step 3", "Step 5"→"Step 4", "Step 6"→"Step 5") or add a brief comment like "Step 3 removed intentionally" to make the gap explicit and avoid confusion when referencing steps in logs/docs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@modelopt/torch/puzzletron/puzzletron.py`:
- Around line 65-74: The step comments in puzzletron.py skip Step 3 which is
confusing; update the comments around the sequence that includes
dist.is_master(), build_library_and_stats.launch_build_library_and_stats,
dist.barrier(), scoring.launch_scoring, and
mip_and_realize_models.launch_mip_and_realize_model to either renumber them
sequentially (e.g., change "Step 4"→"Step 3", "Step 5"→"Step 4", "Step 6"→"Step
5") or add a brief comment like "Step 3 removed intentionally" to make the gap
explicit and avoid confusion when referencing steps in logs/docs.
In `@tests/gpu/torch/puzzletron/test_puzzletron.py`:
- Around line 144-152: The code assumes mip_solutions_dir exists before calling
mip_solutions_dir.iterdir(); add a defensive existence check (e.g., if not
mip_solutions_dir.exists() or not mip_solutions_dir.is_dir()) before iterating
and raise/assert with a clear error message that includes puzzle_dir and
mip_solutions_dir when missing, or set solution_dirs to an empty list and fail
the subsequent assertion with that informative message; update the block around
mip_solutions_dir, solution_dirs and the following assert to use this check so
FileNotFoundError is avoided and debugging is easier.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7ed33a66-d1fb-4928-ad4d-611f6fb09e49
📒 Files selected for processing (4)
modelopt/torch/puzzletron/mip/run_puzzle.pymodelopt/torch/puzzletron/puzzletron.pymodelopt/torch/puzzletron/tools/validate_model.pytests/gpu/torch/puzzletron/test_puzzletron.py
💤 Files with no reviewable changes (1)
- modelopt/torch/puzzletron/tools/validate_model.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## feature/puzzletron #995 +/- ##
======================================================
- Coverage 72.13% 72.12% -0.02%
======================================================
Files 209 209
Lines 23628 23628
======================================================
- Hits 17045 17042 -3
- Misses 6583 6586 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
### What does this PR do? Implement puzzletron compression algorithm based on Puzzle paper (https://arxiv.org/abs/2411.19146) <details> <summary> Th list of reviewed and merged MRs that resulted in the feature/puzzletron branch</summary> Merging dkorzekwa/any_model to feature/puzzletron [Add anymodel directories to feature/puzzletron by danielkorzekwa · Pull Request #974 · NVIDIA/Model-Optimizer](#974) - merged [Draft: anymodel activation scoring by danielkorzekwa · Pull Request #989 · NVIDIA/Model-Optimizer](#989) - merged [Draft: Merge anymodel pruning by danielkorzekwa · Pull Request #990 · NVIDIA/Model-Optimizer](#990) - merged [Draft: Merging anymodel:build_library_and_stats by danielkorzekwa · Pull Request #993 · NVIDIA/Model-Optimizer](#993) - merged [Dkorzekwa/any model calc one block scores by danielkorzekwa · Pull Request #994 · NVIDIA/Model-Optimizer](#994) - merged [Draft: merge any_model: mip_and_realize_models by danielkorzekwa · Pull Request #995 · NVIDIA/Model-Optimizer](#995) - merged [Dkorzekwa/any model other modeqls by danielkorztiekwa · Pull Request #1007 · NVIDIA/Model-Optimizer](#1007) - merged PR to 1007: #1039 - merged [Dkorzekwa/anymodel gptoss by danielkorzekwa · Pull Request #1020 · NVIDIA/Model-Optimizer](#1020) - merged [Merge any_model tutorial by danielkorzekwa · Pull Request #1035 · NVIDIA/Model-Optimizer](#1035) - merged [Merge mbridge distillation for any_model by danielkorzekwa · Pull Request #1036 · NVIDIA/Model-Optimizer](#1036) - merged [MR branch for the remaining difference between dkorzekwa/any_model an… by danielkorzekwa · Pull Request #1047 · NVIDIA/Model-Optimizer](#1047) - merged [Dkorzekwa/decilm hf code cleanup by danielkorzekwa · Pull Request #1071 · NVIDIA/Model-Optimizer](#1071) - merged [Dkorzekwa/decilm hf code cleanup 2 by danielkorzekwa · Pull Request #1073 · NVIDIA/Model-Optimizer](#1073) - merged [Dkorzekwa/anymodel subblock stats by danielkorzekwa · Pull Request #1085 · NVIDIA/Model-Optimizer](#1085) - merged [Dkorzekwa/anymodel subblock stats nodecilm by danielkorzekwa · Pull Request #1102 · NVIDIA/Model-Optimizer](#1102) - merged [Dkorzekwa/decilm cleanup post subblockstats by danielkorzekwa · Pull Request #1103 · NVIDIA/Model-Optimizer](#1103) - merged [code clean up by danielkorzekwa · Pull Request #1110 · NVIDIA/Model-Optimizer](#1110) - merged Merging into main: [Activation hooks redesign (reuse hooks component across both minitron and puzzletron) by danielkorzekwa · Pull Request #1022 · NVIDIA/Model-Optimizer](#1022) - merged [Dkorzekwa/puzzletron use importance hooks from prune by danielkorzekwa · Pull Request #1115 · NVIDIA/Model-Optimizer](#1115) - merged </details> <!-- Details about the change. --> ### Usage Puzzletron tutorial: https://github.com/NVIDIA/Model-Optimizer/tree/feature/puzzletron/examples/puzzletron ### Testing The main e2e test for compressing 9 models with Puzzletron: https://github.com/NVIDIA/Model-Optimizer/blob/feature/puzzletron/tests/gpu/torch/puzzletron/test_puzzletron.py 2-gpu nightly tests: - https://github.com/NVIDIA/Model-Optimizer/actions/runs/24468209205/job/71501061203 - https://github.com/NVIDIA/Model-Optimizer/actions/runs/24470214159/job/71508152952 ### Before your PR is "*Ready for review*" - Is this change backward compatible?: ✅ - If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ - Did you write any new necessary tests?: ✅ - Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added Puzzletron: end-to-end heterogeneous pruning & NAS workflow with AnyModel support, example pipelines, deployment and evaluation utilities, and tools for converting/pruning and exporting compressed checkpoints. * **Documentation** * Comprehensive Puzzletron tutorials, model-specific guides, evaluator instructions, example configs, and changelog entry. * **Chores** * CI/workflow updates (extras installation, longer GPU test timeout), pre-commit hook exclusion updated, and CODEOWNERS entries added. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Signed-off-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Signed-off-by: jrausch <jrausch@nvidia.com> Signed-off-by: root <root@pool0-00848.cm.cluster> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Co-authored-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com> Co-authored-by: J Rausch <38429553+j-rausch@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
What does this PR do?
Merging dkorzekwa/mip_and_realize_models into dkorzekwa/any_model_calc_one_block_scores - this MR is only for reviewing. Ultimately dkorzekwa/mip_and_realize_models should be merged into feature/puzzletron once dkorzekwa/any_model_calc_one_block_scores is merged there.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests
Chores