[FR]: save model from pass OnnxStaticQuantization quant_preprocess = true #1632

xieofxie · 2025-02-21T02:43:25Z

Proposal Summary

Do we have a way to do this? Sometimes I need to quantize starting from the preprocessed model, and the preprocessing step should not have to be rerun multiple times.

We could either:

- Add a new pass that runs only the quantization preprocessing.
- Add a parameter to OnnxStaticQuantization so that it only saves the preprocessed model and skips quantization.

What component(s) does this request affect?

jambayk · 2025-02-21T19:55:15Z

This is a good idea. Having to rerun this multiple times was a concern for us so we tried to do some caching with

Olive/olive/passes/onnx/quantization.py

Line 395 in 51d2c8a

preprocessed_temp_model_path = (

but that only persists during the lifetime of the pass and was meant for use with search.

I think for a general use case, we could create a new pass to do quant preprocess. Providing a directory to save the preprocessed model is a possibility too but I am not sure if we want the user to have to manage that.
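To make the "directory to save the preprocessed model" option concrete, here is a rough sketch of what user-managed caching could look like. This is not Olive code: `cached_preprocess`, its parameters, and the cache file naming are all hypothetical. The preprocessing step is passed in as a callable; I believe onnxruntime's `quant_pre_process` can be called with input and output paths positionally, but treat that as an assumption.

```python
import hashlib
from pathlib import Path
from typing import Callable, Union


def cached_preprocess(
    model_path: Union[str, Path],
    cache_dir: Union[str, Path],
    preprocess: Callable[[str, str], None],
) -> Path:
    """Run `preprocess(input_path, output_path)` at most once per model file.

    Hypothetical helper, not part of Olive. The cache key is a hash of the
    model file's bytes, so a changed model triggers a fresh preprocess.
    Note: external-data files stored next to the model are NOT hashed here.
    """
    model_path = Path(model_path)
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(model_path.read_bytes()).hexdigest()[:16]
    out_path = cache_dir / f"{model_path.stem}-{digest}-preprocessed.onnx"

    if not out_path.exists():
        # Write to a temporary name first so an interrupted run does not
        # leave a partial file that later looks like a valid cache entry.
        tmp_path = out_path.with_suffix(".tmp")
        preprocess(str(model_path), str(tmp_path))
        tmp_path.replace(out_path)

    return out_path


# Possible usage (assumes onnxruntime is installed):
# from onnxruntime.quantization import quant_pre_process
# preprocessed = cached_preprocess("model.onnx", "preprocess_cache", quant_pre_process)
```

The downside is the one mentioned above: the user has to own and clean up the cache directory, which a dedicated pass with Olive's own caching would avoid.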

xieofxie · 2025-02-24T01:45:03Z

Yes, a new pass would be very useful, because many parameters of OnnxStaticQuantization are based on the preprocessed model, not on the input model of the pass.

xieofxie added the enhancement New feature or request label Feb 21, 2025
