Draft
54 commits
5d9c272
quant_cfg as a list
shengliangxu Mar 17, 2026
d99e4ae
Make quant_cfg a list of tuples, dict is too much
shengliangxu Mar 18, 2026
b5bea21
yaml config format update
shengliangxu Mar 18, 2026
1b8c4bf
fix some extra quant_cfg
shengliangxu Mar 18, 2026
ab4daec
fix tests
shengliangxu Mar 19, 2026
4ffd2fa
rename from format to cfg
shengliangxu Mar 19, 2026
d599103
pattern to path
shengliangxu Mar 19, 2026
fc53877
flatten the inner configs
shengliangxu Mar 19, 2026
a19335f
get rid of the special 'default'
shengliangxu Mar 19, 2026
04014ec
remove default
shengliangxu Mar 19, 2026
22134ef
match yaml file format
shengliangxu Mar 20, 2026
f52d213
fix tests
shengliangxu Mar 20, 2026
8f59142
fix guide
shengliangxu Mar 20, 2026
3cda60f
default to disable
shengliangxu Mar 20, 2026
43f9a1a
tuple format is not needed, remove all of them
shengliangxu Mar 20, 2026
4549001
final remove tuple format
shengliangxu Mar 20, 2026
30bb041
add atomicity to doc
shengliangxu Mar 20, 2026
ff9fdd9
fix more quant_cfg args
shengliangxu Mar 20, 2026
a164f13
distinguish set_quantizer_attributes_full and set_quantizer_attribute…
shengliangxu Mar 21, 2026
dc915f5
new partial set quantizer cfg for internal merging logic
shengliangxu Mar 22, 2026
10c4cdd
enable semantic documentation
shengliangxu Mar 22, 2026
a03d975
revert accidental test change
shengliangxu Mar 22, 2026
fb3bb07
fix mypy
shengliangxu Mar 22, 2026
aecf832
new tests and fix existing tests
shengliangxu Mar 23, 2026
5115452
python < 3.12
shengliangxu Mar 23, 2026
a481bd1
more fix dict to list
shengliangxu Mar 23, 2026
fe2d2f3
KV config has only quant_cfg meaningful
shengliangxu Mar 23, 2026
3a3b112
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Mar 23, 2026
b9d67d3
fix tests
shengliangxu Mar 25, 2026
823d602
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Mar 25, 2026
9bcd06e
fix: entry is a dict
shengliangxu Mar 26, 2026
2721483
fix megatron tests
shengliangxu Mar 26, 2026
9752f05
fix deepseek example semantic
shengliangxu Mar 26, 2026
cd65849
more fixes
shengliangxu Mar 26, 2026
113a035
Merge remote-tracking branch 'origin/main' into shengliangx/quant_cfg…
shengliangxu Mar 31, 2026
aa2a881
convert new yaml file
shengliangxu Mar 31, 2026
c5ff747
more format fixes
shengliangxu Mar 31, 2026
a505579
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Apr 1, 2026
bf26d30
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Apr 1, 2026
26d46f5
fix review comments
shengliangxu Apr 1, 2026
6a92c16
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Apr 1, 2026
b71c80b
more tests and fixes
shengliangxu Apr 1, 2026
792efc7
more updates and fixes
shengliangxu Apr 1, 2026
bee2c9d
more fixes
shengliangxu Apr 1, 2026
f9122b7
more improvements
shengliangxu Apr 1, 2026
6018fb0
more fixes and more tests
shengliangxu Apr 1, 2026
f034b43
more fixes
shengliangxu Apr 1, 2026
54823a3
More improvements
shengliangxu Apr 1, 2026
2bba55a
more improvments
shengliangxu Apr 1, 2026
6418b26
even more fixes and improvements
shengliangxu Apr 1, 2026
1b6b291
more improvements, using copy
shengliangxu Apr 1, 2026
f44a8c7
Merge branch 'main' into shengliangx/quant_cfg-list
shengliangxu Apr 1, 2026
ac353e2
attempt to fix windows unit test failure
shengliangxu Apr 1, 2026
8a961e3
Add recipes system documentation guide
shengliangxu Apr 2, 2026
2 changes: 2 additions & 0 deletions docs/source/guides/1_quantization.rst
@@ -19,6 +19,8 @@ Below, you can find the documentation for the quantization toolkit in ModelOpt:
     ./_basic_quantization.rst
     ./_choosing_quant_methods.rst
     ./_pytorch_quantization.rst
+    ./_quant_cfg.rst
+    ./_recipes.rst
     ./_customized_model_quantization.rst
     ./_compress_quantized_models.rst
     ./_onnx_quantization.rst
33 changes: 21 additions & 12 deletions docs/source/guides/_pytorch_quantization.rst
@@ -237,14 +237,16 @@ For debugging purposes or simple customizations, you can modify an existing conf

.. code-block:: python

-   # Create a copy of the default INT8 configuration
-   config = mtq.INT8_DEFAULT_CFG.copy()
+   import copy
+
+   # Create a deep copy of the default INT8 configuration
+   config = copy.deepcopy(mtq.INT8_DEFAULT_CFG)

-   # Disable input quantizers for all layers
-   config["quant_cfg"]["*input_quantizer"]["enable"] = False
+   # Disable input quantizers for all layers (appended last, so it takes precedence)
+   config["quant_cfg"].append({"quantizer_path": "*input_quantizer", "enable": False})

    # Disable all quantizers for layers matching the pattern "layer1.*"
-   config["quant_cfg"]["*layer1.*"] = {"enable": False}
+   config["quant_cfg"].append({"quantizer_path": "*layer1.*", "enable": False})
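Because appended entries override earlier matches, list order is significant. Below is a minimal, self-contained sketch of this last-match-wins resolution; it assumes ``fnmatch``-style glob matching and shallow key merging, which may differ from ModelOpt's internal resolution logic:

```python
from fnmatch import fnmatch


def resolve_quantizer_cfg(quant_cfg, quantizer_name):
    """Fold list entries in order; later matching entries win (illustrative only)."""
    resolved = {}
    for entry in quant_cfg:
        if fnmatch(quantizer_name, entry["quantizer_path"]):
            # Merge all keys except the pattern itself; later entries overwrite
            resolved.update({k: v for k, v in entry.items() if k != "quantizer_path"})
    return resolved


quant_cfg = [
    {"quantizer_path": "*input_quantizer", "enable": True, "cfg": {"num_bits": 8}},
    {"quantizer_path": "*input_quantizer", "enable": False},  # appended last, wins
]
print(resolve_quantizer_cfg(quant_cfg, "layer0.input_quantizer"))
# prints: {'enable': False, 'cfg': {'num_bits': 8}}
```

Note that only the ``enable`` key is overridden by the second entry; keys set by earlier matches and not restated survive the merge.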

Advanced Configuration Creation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -253,18 +255,23 @@

.. code-block:: python

    from modelopt.torch.quantization.config import _default_disabled_quantizer_cfg

    # Custom configuration for INT4 block-wise weights and INT8 dynamic activations
    MY_CUSTOM_CONFIG = {
-       "quant_cfg": {
+       "quant_cfg": [
+           # Disable all quantizers by default, then enable selectively
+           {"quantizer_path": "*", "enable": False},
+
            # Configure weight quantizers with 4-bit precision and 128-element blocks
-           "*weight_quantizer": {"num_bits": 4, "block_sizes": {-1: 128}, "enable": True},
+           {"quantizer_path": "*weight_quantizer", "cfg": {"num_bits": 4, "block_sizes": {-1: 128}}, "enable": True},

            # Configure input quantizers with 8-bit dynamic quantization
-           "*input_quantizer": {"num_bits": 8, "type": "dynamic", "block_sizes": {-1: None}},
+           {"quantizer_path": "*input_quantizer", "cfg": {"num_bits": 8, "type": "dynamic", "block_sizes": {-1: None}}},

            # Include default disabled quantizer configurations
-           **_default_disabled_quantizer_cfg,
-       },
+           *_default_disabled_quantizer_cfg,
+       ],
        "algorithm": "max",
    }
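Since each entry in the new list format is a plain dict, a structural sanity check is straightforward. The sketch below assumes a schema of a required ``quantizer_path`` glob plus optional ``cfg`` and ``enable`` keys; ModelOpt's real validation (via its config classes) is likely stricter:

```python
def validate_quant_cfg(quant_cfg):
    """Illustrative structural check for the list-based quant_cfg format.

    Assumes each entry is a dict with a required "quantizer_path" glob and
    optional "cfg" / "enable" keys (the actual ModelOpt schema may differ).
    """
    allowed = {"quantizer_path", "cfg", "enable"}
    for i, entry in enumerate(quant_cfg):
        if "quantizer_path" not in entry:
            raise ValueError(f"entry {i} is missing 'quantizer_path'")
        unknown = set(entry) - allowed
        if unknown:
            raise ValueError(f"entry {i} has unknown keys: {sorted(unknown)}")


# Passes: both entries follow the assumed schema
validate_quant_cfg([
    {"quantizer_path": "*", "enable": False},
    {"quantizer_path": "*weight_quantizer", "cfg": {"num_bits": 4}, "enable": True},
])
```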

@@ -394,8 +401,10 @@ You can specify ``custom_calib`` as ``algorithm`` in ``quant_cfg`` to use it. He

# create quantization configuration with "custom_calib" method
    quant_cfg = {
-       'quant_cfg': {'*weight_quantizer': ..},
+       'quant_cfg': [
+           {"quantizer_path": "*weight_quantizer", "cfg": {...}},
+       ],
        'algorithm': {"method": 'custom_calib'},
    }
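The ``algorithm`` field selects the calibration method by name, either as a bare string or as a dict carrying extra options. A hedged sketch of such name-based dispatch (the registry below is hypothetical, not ModelOpt's actual mechanism):

```python
# Hypothetical registry mapping method names to calibration callables;
# ModelOpt's real dispatch mechanism may differ.
CALIB_METHODS = {
    "max": lambda model: f"max-calibrated {model}",
    "custom_calib": lambda model: f"custom-calibrated {model}",
}


def run_calibration(cfg, model):
    algo = cfg["algorithm"]
    # Accept either a bare method name ("max") or a dict ({"method": "custom_calib"})
    method = algo["method"] if isinstance(algo, dict) else algo
    return CALIB_METHODS[method](model)


print(run_calibration({"algorithm": {"method": "custom_calib"}}, "toy_model"))
# prints: custom-calibrated toy_model
```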

