recipes doc by shengliangxu · Pull Request #1165 · NVIDIA/Model-Optimizer

shengliangxu · 2026-04-02T00:07:16Z

What does this PR do?

Type of change: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

Is this change backward compatible?: ✅ / ❌ / N/A
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
Did you write any new necessary tests?: ✅ / ❌ / N/A
Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Right now quant_cfg is a dict, but we use it as if it were a list. When we apply quant_cfg, we enumerate the items in the dict and apply the configs one by one in modelopt/torch/quantization/conversion.py. This implementation effectively gives later configs higher precedence than earlier ones. However, dicts do not have reliable ordering. Therefore, we make quant_cfg a list of patterns:

1. Later config patterns have higher precedence. A later config in the list overrides an earlier config if they target the same module.
2. The config for each module is atomic: each config provides the full information. We do not compose a quant module config from multiple config lines.

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
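The two rules above (later entries win, each entry is atomic) can be sketched in a few lines. This is a minimal illustration, not the code in conversion.py: the entry shape with `pattern` and `cfg` keys, the attribute names, and the `resolve` helper are all assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical entry shape: a module-name pattern paired with a complete
# quantizer config. The key names are illustrative, not the real schema.
quant_cfg = [
    {"pattern": "*", "cfg": {"enable": False}},
    {"pattern": "*weight_quantizer", "cfg": {"num_bits": 8, "axis": 0}},
    {"pattern": "*lm_head*", "cfg": {"enable": False}},
]


def resolve(name, entries):
    """Return the config of the last entry whose pattern matches `name`."""
    resolved = None
    for entry in entries:  # list order is the precedence order
        if fnmatchcase(name, entry["pattern"]):
            # Atomic: a later match replaces the earlier config outright,
            # nothing is merged across entries.
            resolved = dict(entry["cfg"])
    return resolved


weight_cfg = resolve("model.layers.0.mlp.weight_quantizer", quant_cfg)
head_cfg = resolve("lm_head.weight_quantizer", quant_cfg)
input_cfg = resolve("model.layers.0.input_quantizer", quant_cfg)
```

In this sketch `lm_head.weight_quantizer` matches all three patterns, and the last one decides the result. Reordering the list changes the outcome, which is why an explicitly ordered container is the natural fit.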

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

…s_partial

set_quantizer_attributes_full updates the full set of quantizer attributes; it has atomic semantics. set_quantizer_attributes_partial updates only a subset of the quantizer attributes; it has merge semantics.

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
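The difference between atomic and merge semantics can be shown with plain dictionaries. The helpers below, `apply_full` and `apply_partial`, are hypothetical stand-ins written for this illustration; they are not the signatures of set_quantizer_attributes_full or set_quantizer_attributes_partial, and the real functions may, for example, fill unspecified attributes with defaults.

```python
def apply_full(current, update):
    """Atomic semantics: the update replaces the existing attributes wholesale."""
    return dict(update)


def apply_partial(current, update):
    """Merge semantics: only the keys named in the update change."""
    return {**current, **update}


# Illustrative attribute names, assumed for the example.
attrs = {"num_bits": 8, "axis": 0, "enable": True}

full = apply_full(attrs, {"num_bits": 4})
partial = apply_partial(attrs, {"num_bits": 4})
```

Under the full update, `axis` and `enable` are gone because the new config did not state them. Under the partial update, they carry over from the previous state.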

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

…-list

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

copy-pr-bot · 2026-04-02T00:07:20Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-04-02T00:07:26Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7f7e4316-e769-469b-8180-69ba71531598

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


github-actions · 2026-04-02T00:12:06Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1165/
Built to branch `gh-pages` at 2026-04-02 00:31 UTC. Preview will be ready when the GitHub Pages deployment is complete.

codecov · 2026-04-02T00:19:58Z

Codecov Report

❌ Patch coverage is 84.79730% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.55%. Comparing base (de55e8a) to head (8a961e3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/quantization/conversion.py | 78.78% | 21 Missing ⚠️ |
| ...delopt/onnx/llm_export_utils/quantization_utils.py | 50.00% | 5 Missing ⚠️ |
| modelopt/torch/quantization/algorithms.py | 89.74% | 4 Missing ⚠️ |
| modelopt/torch/quantization/config.py | 96.07% | 4 Missing ⚠️ |
| modelopt/torch/quantization/utils/core_utils.py | 42.85% | 4 Missing ⚠️ |
| ...torch/quantization/backends/fp8_per_tensor_gemm.py | 70.00% | 3 Missing ⚠️ |
| modelopt/torch/quantization/backends/nvfp4_gemm.py | 66.66% | 3 Missing ⚠️ |
| modelopt/torch/quantization/compress.py | 66.66% | 1 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1165       +/-   ##
===========================================
+ Coverage   54.54%   70.55%   +16.00%     
===========================================
  Files         348      349        +1     
  Lines       39766    40021      +255     
===========================================
+ Hits        21691    28237     +6546     
+ Misses      18075    11784     -6291

Flag	Coverage Δ
unit	`54.61% <76.01%> (+0.06%)`	⬆️


New _recipes.rst covers recipe file format, built-in recipes, loading API, ExMy notation, path resolution, and future directions. Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

shengliangxu and others added 30 commits March 22, 2026 23:44

Make quant_cfg a list of tuples, dict is too much

d99e4ae

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

yaml config format update

b5bea21

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix some extra quant_cfg

1b8c4bf

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix tests

ab4daec

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

rename from format to cfg

4ffd2fa

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

pattern to path

d599103

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

flatten the inner configs

fc53877

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

get rid of the special 'default'

a19335f

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

remove default

04014ec

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

match yaml file format

22134ef

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix tests

f52d213

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix guide

8f59142

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

default to disable

3cda60f

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

tuple format is not needed, remove all of them

43f9a1a

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

final remove tuple format

4549001

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

add atomicity to doc

30bb041

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix more quant_cfg args

ff9fdd9

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

new partial set quantizer cfg for internal merging logic

dc915f5

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

enable semantic documentation

10c4cdd

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

revert accidental test change

a03d975

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix mypy

fb3bb07

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

new tests and fix existing tests

aecf832

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

python < 3.12

5115452

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more fix dict to list

a481bd1

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

KV config has only quant_cfg meaningful

fe2d2f3

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge branch 'main' into shengliangx/quant_cfg-list

3a3b112

fix tests

b9d67d3

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge branch 'main' into shengliangx/quant_cfg-list

823d602

shengliangxu and others added 23 commits March 26, 2026 00:33

fix: entry is a dict

9bcd06e

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix megatron tests

2721483

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

fix deepseek example semantic

9752f05

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more fixes

cd65849

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge remote-tracking branch 'origin/main' into shengliangx/quant_cfg…

113a035

…-list

convert new yaml file

aa2a881

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more format fixes

c5ff747

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge branch 'main' into shengliangx/quant_cfg-list

a505579

Merge branch 'main' into shengliangx/quant_cfg-list

bf26d30

fix review comments

26d46f5

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge branch 'main' into shengliangx/quant_cfg-list

6a92c16

more tests and fixes

b71c80b

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more updates and fixes

792efc7

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more fixes

bee2c9d

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more improvements

f9122b7

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more fixes and more tests

6018fb0

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more fixes

f034b43

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

More improvements

54823a3

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more improvments

2bba55a

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

even more fixes and improvements

6418b26

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

more improvements, using copy

1b6b291

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Merge branch 'main' into shengliangx/quant_cfg-list

f44a8c7

attempt to fix windows unit test failure

ac353e2

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

Add recipes system documentation guide

8a961e3

New _recipes.rst covers recipe file format, built-in recipes, loading API, ExMy notation, path resolution, and future directions. Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>

shengliangxu force-pushed the shengliangx/recipes-doc branch from eea1051 to 8a961e3 Compare April 2, 2026 00:27

recipes doc #1165

shengliangxu wants to merge 54 commits into main from shengliangx/recipes-doc
