Skip to content

[fix](simplify agg) SimplifyAggGroupBy should verify injectivity#64335

Open
yujun777 wants to merge 1 commit into
apache:masterfrom
yujun777:fix-simplify-agg-group-by
Open

[fix](simplify agg) SimplifyAggGroupBy should verify injectivity#64335
yujun777 wants to merge 1 commit into
apache:masterfrom
yujun777:fix-simplify-agg-group-by

Conversation

@yujun777

Copy link
Copy Markdown
Contributor

Problem

SimplifyAggGroupBy simplified GROUP BY f(x) to GROUP BY x without
verifying that f(x) is injective (one-to-one). This caused wrong results:

Expression Why wrong
a * 0 / 0 * a always evaluates to 0 — all rows fall into one group
0 / a always evaluates to 0
a / 0 division by zero
a + NULL / a * NULL / ... always evaluates to NULL
a * 0.1 with float/double precision loss may map different inputs to same result

Fix

  1. isBinaryArithmeticSlot: restructured to separate slot-expr from literal,
    then validate each independently. Float/double check runs early, before
    slot extraction.

  2. New checkLiteral(expr, literal): rejects NULL literal and
    Multiply/Divide by zero.

  3. New canExtractSlot(expr): replaces the old unconditional
    extractSlotOrCastOnSlot — only accepts bare Slot or implicit lossless
    widening casts (integral→integral, float→double, integral→decimal,
    decimal→decimal). Range and scale are compared directly for correctness.

  4. New isLosslessWidening(src, tgt): type-family-aware widening check
    using width() for integral/float, and DecimalV3Type.forType() for
    decimal range/scale comparison.

Changes

  • SimplifyAggGroupBy.java: +80 lines, rewritten core logic
  • ExpressionUtils.java: -35 lines, removed unused isSlotOrCastOnSlot /
    extractSlotOrCastOnSlot
  • SimplifyAggGroupByTest.java: +216 lines, 25 tests covering all new paths

The rule simplified GROUP BY f(x) to GROUP BY x without verifying
that f(x) is injective (one-to-one). This caused wrong results when:
  - literal is NULL (any op: a+NULL → always NULL)
  - Multiply/Divide by zero (a*0 → always 0)
  - Multiply/Divide with float/double operands (precision loss)

Also adds proper handling of implicit lossless widening casts
(integral→integral, float→double, integral→decimal, decimal→decimal)
and removes the now-unused ExpressionUtils.extractSlotOrCastOnSlot.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777

Copy link
Copy Markdown
Contributor Author

run buildall

1 similar comment
@yujun777

Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 29419 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0325eaa0a5f0cc1d1259a3b2506c366bcf660300, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17633	4149	4133	4133
q2	q3	10760	1420	819	819
q4	4685	471	342	342
q5	7542	886	599	599
q6	187	171	136	136
q7	773	872	632	632
q8	9349	1629	1681	1629
q9	5943	4535	4481	4481
q10	6759	1786	1504	1504
q11	436	263	248	248
q12	632	421	289	289
q13	18105	3364	2732	2732
q14	263	264	239	239
q15	q16	820	776	709	709
q17	923	1002	975	975
q18	6897	5661	5619	5619
q19	1329	1334	991	991
q20	514	402	259	259
q21	6313	2844	2751	2751
q22	470	374	332	332
Total cold run time: 100333 ms
Total hot run time: 29419 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5051	5063	5021	5021
q2	q3	4868	5281	4722	4722
q4	2104	2232	1403	1403
q5	4938	4655	4668	4655
q6	241	181	127	127
q7	1970	1704	1566	1566
q8	2438	2189	2195	2189
q9	7888	7422	7365	7365
q10	4769	4671	4191	4191
q11	527	382	350	350
q12	732	739	522	522
q13	2951	3398	2760	2760
q14	280	281	262	262
q15	q16	668	701	610	610
q17	1281	1261	1259	1259
q18	7110	6882	6894	6882
q19	1087	1118	1060	1060
q20	2226	2196	1937	1937
q21	5301	4558	4441	4441
q22	503	454	416	416
Total cold run time: 56933 ms
Total hot run time: 51738 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 169732 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0325eaa0a5f0cc1d1259a3b2506c366bcf660300, data reload: false

query5	4309	636	477	477
query6	453	192	176	176
query7	4891	583	303	303
query8	359	234	200	200
query9	8764	3977	4001	3977
query10	458	303	264	264
query11	5926	2327	2138	2138
query12	178	105	100	100
query13	1249	601	437	437
query14	6388	5408	5104	5104
query14_1	4395	4359	4367	4359
query15	203	202	176	176
query16	995	468	448	448
query17	1154	731	596	596
query18	2453	486	362	362
query19	208	187	147	147
query20	116	108	107	107
query21	221	137	120	120
query22	13603	13566	13503	13503
query23	17324	16513	16195	16195
query23_1	16237	16399	16356	16356
query24	7521	1769	1313	1313
query24_1	1316	1311	1290	1290
query25	585	471	419	419
query26	1299	318	179	179
query27	2669	587	364	364
query28	4458	2012	2056	2012
query29	1136	642	510	510
query30	317	246	202	202
query31	1113	1083	954	954
query32	107	63	61	61
query33	557	302	250	250
query34	1196	1151	654	654
query35	742	794	666	666
query36	1405	1386	1270	1270
query37	155	106	91	91
query38	3192	3154	3048	3048
query39	928	962	904	904
query39_1	904	867	869	867
query40	218	127	101	101
query41	64	60	60	60
query42	100	95	95	95
query43	318	323	284	284
query44	
query45	195	186	182	182
query46	1106	1161	766	766
query47	2367	2306	2298	2298
query48	403	432	287	287
query49	623	488	349	349
query50	1044	357	273	273
query51	4399	4265	4287	4265
query52	89	90	79	79
query53	251	267	205	205
query54	288	216	193	193
query55	78	77	71	71
query56	242	245	223	223
query57	1445	1410	1351	1351
query58	256	222	213	213
query59	1565	1633	1397	1397
query60	289	246	230	230
query61	167	158	158	158
query62	704	662	604	604
query63	239	186	187	186
query64	2564	768	662	662
query65	
query66	1782	467	342	342
query67	29740	29687	29521	29521
query68	
query69	416	299	268	268
query70	978	942	962	942
query71	293	226	196	196
query72	2892	2755	2408	2408
query73	848	771	437	437
query74	5126	4994	4786	4786
query75	2635	2572	2260	2260
query76	2300	1155	777	777
query77	354	370	289	289
query78	12323	12355	11816	11816
query79	1383	1082	758	758
query80	580	458	411	411
query81	456	284	260	260
query82	573	152	117	117
query83	356	277	247	247
query84	
query85	869	524	439	439
query86	363	294	290	290
query87	3386	3384	3197	3197
query88	3628	2733	2734	2733
query89	418	395	345	345
query90	2047	178	181	178
query91	174	167	138	138
query92	66	64	57	57
query93	1426	1448	990	990
query94	559	347	326	326
query95	690	488	329	329
query96	1013	854	343	343
query97	2725	2696	2561	2561
query98	218	213	202	202
query99	1142	1198	1035	1035
Total cold run time: 250501 ms
Total hot run time: 169732 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 28739 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0325eaa0a5f0cc1d1259a3b2506c366bcf660300, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17606	4079	4038	4038
q2	q3	10823	1364	820	820
q4	4699	482	338	338
q5	7584	874	578	578
q6	201	171	142	142
q7	795	835	646	646
q8	10195	1678	1688	1678
q9	6932	4537	4465	4465
q10	6752	1815	1511	1511
q11	435	275	245	245
q12	635	430	303	303
q13	18141	3421	2801	2801
q14	299	258	246	246
q15	q16	820	771	711	711
q17	3296	1095	635	635
q18	6736	5829	5516	5516
q19	1721	1259	1101	1101
q20	495	416	268	268
q21	5870	2565	2394	2394
q22	424	362	303	303
Total cold run time: 104459 ms
Total hot run time: 28739 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4348	4258	4266	4258
q2	q3	4502	4970	4297	4297
q4	2099	2197	1383	1383
q5	4424	4268	4288	4268
q6	235	176	129	129
q7	2311	1894	1683	1683
q8	2502	2122	2113	2113
q9	7925	7840	7828	7828
q10	4814	4750	4281	4281
q11	738	413	369	369
q12	744	747	563	563
q13	3281	3654	3051	3051
q14	311	318	287	287
q15	q16	715	741	653	653
q17	1353	1345	1331	1331
q18	7779	7318	6753	6753
q19	1128	1067	1112	1067
q20	2221	2223	1945	1945
q21	5246	4558	4418	4418
q22	533	452	417	417
Total cold run time: 57209 ms
Total hot run time: 51094 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 169104 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0325eaa0a5f0cc1d1259a3b2506c366bcf660300, data reload: false

query5	4317	622	480	480
query6	447	210	187	187
query7	4862	529	296	296
query8	379	226	208	208
query9	8757	4004	3995	3995
query10	466	315	281	281
query11	5947	2377	2151	2151
query12	167	107	99	99
query13	1280	628	430	430
query14	6297	5407	5067	5067
query14_1	4407	4400	4403	4400
query15	203	202	179	179
query16	1017	469	450	450
query17	1136	733	607	607
query18	2547	486	361	361
query19	205	198	152	152
query20	124	110	110	110
query21	218	141	117	117
query22	13596	13540	13397	13397
query23	17321	16454	16157	16157
query23_1	16289	16389	16550	16389
query24	7510	1757	1350	1350
query24_1	1277	1309	1309	1309
query25	574	438	378	378
query26	1313	332	167	167
query27	2665	545	338	338
query28	4446	2026	1993	1993
query29	1067	588	480	480
query30	310	235	197	197
query31	1105	1072	955	955
query32	110	68	59	59
query33	504	318	249	249
query34	1186	1139	598	598
query35	756	770	683	683
query36	1409	1407	1279	1279
query37	149	106	92	92
query38	3187	3139	3053	3053
query39	929	920	902	902
query39_1	875	884	881	881
query40	221	121	104	104
query41	66	63	65	63
query42	95	93	93	93
query43	316	320	277	277
query44	
query45	196	184	177	177
query46	1084	1197	744	744
query47	2365	2346	2254	2254
query48	408	412	308	308
query49	642	475	350	350
query50	995	343	249	249
query51	4316	4319	4228	4228
query52	87	88	76	76
query53	251	271	192	192
query54	264	215	207	207
query55	79	80	73	73
query56	239	220	225	220
query57	1445	1411	1342	1342
query58	237	213	208	208
query59	1589	1605	1417	1417
query60	279	241	227	227
query61	154	160	160	160
query62	687	670	593	593
query63	239	192	183	183
query64	2546	798	665	665
query65	
query66	1819	472	360	360
query67	29725	29624	29578	29578
query68	
query69	457	302	278	278
query70	1021	957	935	935
query71	315	227	214	214
query72	3165	2655	2413	2413
query73	823	763	426	426
query74	5095	4942	4814	4814
query75	2648	2566	2228	2228
query76	2332	1145	758	758
query77	350	378	295	295
query78	12402	12559	11868	11868
query79	1490	1065	752	752
query80	1298	475	392	392
query81	530	280	243	243
query82	580	153	116	116
query83	323	271	256	256
query84	
query85	938	530	434	434
query86	422	297	269	269
query87	3403	3429	3212	3212
query88	3612	2728	2711	2711
query89	424	390	327	327
query90	1858	178	173	173
query91	175	162	135	135
query92	66	61	59	59
query93	1590	1402	835	835
query94	733	367	299	299
query95	658	377	371	371
query96	1074	762	321	321
query97	2695	2734	2563	2563
query98	216	209	201	201
query99	1144	1194	1027	1027
Total cold run time: 251983 ms
Total hot run time: 169104 ms

@hello-stephen

Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 92.11% (35/38) 🎉
Increment coverage report
Complete coverage report

1 similar comment
@hello-stephen

Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 92.11% (35/38) 🎉
Increment coverage report
Complete coverage report

}

@VisibleForTesting
protected static boolean isLosslessWidening(DataType src, DataType tgt) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in #64080, DataType add a new interface isInjectiveCastTo, i think we could reuse it here

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 30.36% (17/56) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants