Remove zero point parameter for dequantizelinear when its zero #3531

pfultz2 · 2024-10-15T20:30:28Z

This removes the zero point from dequantizelinear when it's zero. We don't do this for quantizelinear since the zero point is necessary there for deducing the output type.
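As a rough illustration of why the zero point can be dropped, the sketch below models dequantization on plain scalars rather than on MIGraphX instructions. The names `dequantize` and `zero_point_is_redundant`, and the use of `std::optional` for the zero point, are hypothetical stand-ins for this example and not the library's API. It only shows that `scale * (x - 0)` and `scale * x` agree for every int8 input.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical scalar model of dequantizelinear: y = scale * (x - zero_point).
// An absent zero point is treated as zero.
inline float dequantize(std::int8_t x,
                        float scale,
                        std::optional<std::int8_t> zero_point = std::nullopt)
{
    const int shifted = static_cast<int>(x) - static_cast<int>(zero_point.value_or(0));
    return scale * static_cast<float>(shifted);
}

// Returns true when an explicit zero point of 0 gives the same result as
// omitting the zero point, for every possible int8 input.
inline bool zero_point_is_redundant(float scale)
{
    for(int v = -128; v <= 127; ++v)
    {
        const auto x = static_cast<std::int8_t>(v);
        if(dequantize(x, scale, std::int8_t{0}) != dequantize(x, scale))
            return false;
    }
    return true;
}
```

The same shortcut is not applied to quantizelinear here because, as the description says, its zero point also determines the output type.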

lakhinderwalia · 2024-10-15T22:37:27Z

test/simplify_qdq_test.cpp

@@ -162,7 +165,7 @@ TEST_CASE(qdq_different_scales)
        auto t2     = m1.add_parameter("t2", sh2);
        auto scale1 = m1.add_literal(0.5f);
        auto scale2 = m1.add_literal(0.4f);
-        auto zero   = m1.add_literal(std::int8_t{0});
+        auto zero   = m1.add_literal(std::int8_t{1});


This variable is still called zero, but it is now initialized to 1. Maybe it should be called zp? Also, are you changing its value so it doesn't get eliminated?

lakhinderwalia · 2024-10-15T22:38:50Z

src/simplify_qdq.cpp

+{
+    for(auto ins : iterator_for(m))
+    {
+        if(ins->name() != "dequantizelinear")


In the case of quantizelinear, could this not replace a bunch of 0s with a scalar broadcast? Thanks.

TedThemistokleous · 2024-10-21T17:31:22Z

src/simplify_qdq.cpp

+        auto a       = zp->eval();
+        bool is_zero = false;
+        a.visit([&](auto t) {
+            is_zero = std::all_of(t.begin(), t.end(), [](auto x) { return float_equal(x, 0); });


Isn't zero point based on the input type here for dequantize? If so why are we using float_equal then? Is this more to cover the case of say fp8 as well as int8, int4, etc?

pfultz2 · 2024-10-21

The visitor will visit all data types, so we need float_equal in case the zero point is a floating-point type (which is the case for fp8).
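To illustrate the reply above, here is a minimal, self-contained sketch of a type-generic all-zero check. `float_equal_sketch` and `all_zero` are hypothetical helpers written for this example; MIGraphX's own `float_equal` and its `visit` mechanism are not reproduced here and may behave differently. The point is only that a single generic lambda has to compile and behave sensibly for both integer and floating-point element types.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a type-generic equality helper:
// exact comparison for integers, epsilon-based comparison for floats.
template <class T, class U>
bool float_equal_sketch(T x, U y)
{
    using common = std::common_type_t<T, U>;
    if constexpr(std::is_floating_point_v<common>)
        return std::fabs(static_cast<common>(x) - static_cast<common>(y)) <=
               std::numeric_limits<common>::epsilon();
    else
        return static_cast<common>(x) == static_cast<common>(y);
}

// True when every element of the range compares equal to zero,
// regardless of whether the element type is integral or floating point.
template <class Range>
bool all_zero(const Range& r)
{
    return std::all_of(
        r.begin(), r.end(), [](auto x) { return float_equal_sketch(x, 0); });
}
```

With a helper like this, the same predicate can be applied to an int8 zero point and to a floating-point one without a separate code path per type.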

TedThemistokleous

One question, otherwise I get what you're doing here

TedThemistokleous · 2024-10-21T18:01:32Z

Fix CI but I think this is fine with what you're doing here

migraphx-bot · 2024-10-23T10:42:57Z

Test	Batch	Rate new 427fee	Rate old b73def	Diff	Compare
torchvision-resnet50	64	3,260.08	3,257.93	0.07%	✅
torchvision-resnet50_fp16	64	6,991.66	6,992.99	-0.02%	✅
torchvision-densenet121	32	2,438.54	2,432.26	0.26%	✅
torchvision-densenet121_fp16	32	4,058.84	4,038.39	0.51%	✅
torchvision-inceptionv3	32	1,637.48	1,638.89	-0.09%	✅
torchvision-inceptionv3_fp16	32	2,763.91	2,761.69	0.08%	✅
cadene-inceptionv4	16	776.01	776.39	-0.05%	✅
cadene-resnext64x4	16	810.74	811.37	-0.08%	✅
slim-mobilenet	64	7,538.67	7,532.73	0.08%	✅
slim-nasnetalarge	64	211.54	211.42	0.06%	✅
slim-resnet50v2	64	3,504.60	3,507.25	-0.08%	✅
bert-mrpc-onnx	8	1,148.10	1,147.76	0.03%	✅
bert-mrpc-tf	1	462.55	469.91	-1.57%	✅
pytorch-examples-wlang-gru	1	421.08	514.96	-18.23%	🔴
pytorch-examples-wlang-lstm	1	385.54	386.61	-0.28%	✅
torchvision-resnet50_1	1	805.56	772.05	4.34%	🔆
cadene-dpn92_1	1	396.22	398.73	-0.63%	✅
cadene-resnext101_1	1	384.50	383.67	0.22%	✅
onnx-taau-downsample	1	342.75	342.33	0.12%	✅
dlrm-criteoterabyte	1	33.45	33.33	0.36%	✅
dlrm-criteoterabyte_fp16	1	52.74	52.70	0.06%	✅
agentmodel	1	8,499.64	8,056.20	5.50%	🔆
unet_fp16	2	58.86	58.92	-0.10%	✅
resnet50v1_fp16	1	944.31	950.32	-0.63%	✅
resnet50v1_int8	1	1,003.77	1,000.02	0.37%	✅
bert_base_cased_fp16	64	1,170.09	1,169.24	0.07%	✅
bert_large_uncased_fp16	32	363.92	363.69	0.06%	✅
bert_large_fp16	1	198.80	198.89	-0.04%	✅
distilgpt2_fp16	16	2,206.09	2,203.09	0.14%	✅
yolov5s	1	532.52	540.85	-1.54%	✅
tinyllama	1	43.72	43.43	0.65%	✅
vicuna-fastchat	1	181.32	170.64	6.26%	🔆
whisper-tiny-encoder	1	419.46	418.21	0.30%	✅
whisper-tiny-decoder	1	427.68	426.10	0.37%	✅

This build is not recommended to merge 🔴

migraphx-bot · 2024-10-23T10:42:58Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

Remove zero zero points for dequantizelinear

634128c

pfultz2 requested a review from causten as a code owner October 15, 2024 20:30

pfultz2 requested review from lakhinderwalia, shivadbhavsar and TedThemistokleous and removed request for causten October 15, 2024 20:30

Format

201ff99

lakhinderwalia reviewed Oct 15, 2024

lakhinderwalia mentioned this pull request Oct 16, 2024

[INT4] Compress model by quantizing weights to int4 #3307

Open

18 tasks

causten assigned pfultz2 Oct 16, 2024

TedThemistokleous reviewed Oct 21, 2024

TedThemistokleous requested changes Oct 21, 2024

TedThemistokleous approved these changes Oct 21, 2024

TedThemistokleous added the Perf Improve label Oct 21, 2024

Merge branch 'develop' into dq-zero-zp

427fee7
