Skip to content

[fix](be) Return NaN for avg_weighted when sum of weights is zero#64333

Open
jacktengg wants to merge 1 commit into
apache:masterfrom
jacktengg:fix-rqg-avg-weighted
Open

[fix](be) Return NaN for avg_weighted when sum of weights is zero#64333
jacktengg wants to merge 1 commit into
apache:masterfrom
jacktengg:fix-rqg-avg-weighted

Conversation

@jacktengg

@jacktengg jacktengg commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Issue Number: close #xxx

Problem Summary: When the sum of weights passed to avg_weighted is zero but the weighted data sum is non-zero, the function computed data_sum / weight_sum and returned +/-Infinity (e.g. avg_weighted(f1, f2) over rows (1, 1), (2, -1) returned -Infinity). The expected mathematical result of a division by zero weight is undefined, so it should return NaN.

This fixes AggregateFunctionAvgWeightedData::get() to return quiet_NaN() when weight_sum is exactly zero, instead of producing Infinity.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Issue Number: close #xxx

Problem Summary: When the sum of weights passed to avg_weighted is zero
but the weighted data sum is non-zero, the function computed data_sum /
weight_sum and returned +/-Infinity (e.g. avg_weighted(f1, f2) over rows
(1, 1), (2, -1) returned -Infinity). The expected mathematical result of
a division by zero weight is undefined, so it should return NaN.

This fixes AggregateFunctionAvgWeightedData::get() to return
quiet_NaN() when weight_sum is exactly zero, instead of producing
Infinity.
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@jacktengg

Copy link
Copy Markdown
Contributor Author

/review

@jacktengg

Copy link
Copy Markdown
Contributor Author

run buildall

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: no blocking code issues found.

Critical checkpoint conclusions:

  • Goal and proof: the PR changes avg_weighted finalization so a non-empty aggregate state with accumulated weight_sum == 0 returns quiet NaN instead of +/-Infinity. The added regression case covers weights 1 and -1 and expects NaN.
  • Scope: the implementation is small and focused: one include, one final-result branch, and one regression case.
  • Concurrency and lifecycle: not applicable. This only reads aggregate state during result insertion; no shared state, locks, static initialization dependency, or lifecycle ownership is introduced.
  • Config and compatibility: no config, storage format, function signature, serialized aggregate state, or FE/BE protocol change. Existing serialized state remains data_sum + weight_sum.
  • Parallel paths: I checked the registration/combinator path. avg_weighted and foreach wrappers use this same nested implementation, so I did not find another result path that still returns infinity for the same zero-weight state.
  • Conditions and error handling: the new weight_sum == 0 condition has a concrete semantic trigger and does not suppress errors or ignore Status values.
  • Test coverage and result correctness: the added regression follows the existing suite style, drops the table before use, uses a hardcoded table name, and the expected NaN spelling matches existing floating-point output conventions. git diff --check is clean.
  • Observability and performance: no new observability is needed for this finalization-only behavior. The extra branch is outside accumulation/merge loops and should be negligible.
  • Transactions, persistence, data writes, and FE-BE variables: not applicable.
  • User focus: no additional user-provided focus points were present.

I did not run a local Doris regression cluster. Current CI has pending BE/performance checks; the macOS BE UT failure I inspected stops before build with "ERROR: The JAVA version is 25, it must be JDK-17", which appears unrelated to this change.

Minor process note: the PR body still says Release note: None even though this is a user-visible bug fix. I am not blocking on that, but it may be worth aligning before merge.

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 29181 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cd4acb1d0d0d40d1085ed15fa47c02b0318a1aff, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17609	4042	4148	4042
q2	q3	10736	1375	821	821
q4	4694	478	348	348
q5	7612	881	601	601
q6	189	181	144	144
q7	783	873	632	632
q8	9420	1479	1589	1479
q9	5938	4546	4517	4517
q10	6766	1780	1552	1552
q11	448	277	251	251
q12	650	447	289	289
q13	18181	3461	2787	2787
q14	265	257	245	245
q15	q16	821	799	713	713
q17	957	837	882	837
q18	6946	5793	5530	5530
q19	1401	1239	1124	1124
q20	506	388	256	256
q21	6488	2850	2703	2703
q22	463	384	310	310
Total cold run time: 100873 ms
Total hot run time: 29181 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5045	4807	4755	4755
q2	q3	4892	5269	4657	4657
q4	2113	2157	1387	1387
q5	4787	4863	4644	4644
q6	227	173	130	130
q7	1822	1842	1615	1615
q8	2424	2083	2081	2081
q9	7786	7641	7332	7332
q10	4736	4645	4199	4199
q11	535	390	353	353
q12	732	736	527	527
q13	3024	3404	2793	2793
q14	282	278	243	243
q15	q16	677	690	615	615
q17	1293	1269	1258	1258
q18	7098	6815	6769	6769
q19	1180	1114	1094	1094
q20	2220	2216	1931	1931
q21	5241	4648	4559	4559
q22	510	445	427	427
Total cold run time: 56624 ms
Total hot run time: 51369 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 169314 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cd4acb1d0d0d40d1085ed15fa47c02b0318a1aff, data reload: false

query5	4337	636	478	478
query6	476	206	216	206
query7	4865	534	303	303
query8	360	218	203	203
query9	8784	4030	4022	4022
query10	450	344	253	253
query11	5877	2307	2186	2186
query12	159	102	96	96
query13	1240	611	418	418
query14	6345	5371	5042	5042
query14_1	4356	4365	4393	4365
query15	203	199	172	172
query16	1055	451	443	443
query17	1114	710	592	592
query18	2489	484	364	364
query19	206	192	148	148
query20	128	112	108	108
query21	217	141	120	120
query22	13669	13616	13447	13447
query23	17470	16556	16263	16263
query23_1	16369	16303	16261	16261
query24	7695	1776	1334	1334
query24_1	1311	1313	1328	1313
query25	583	474	410	410
query26	1321	351	170	170
query27	2623	536	337	337
query28	4477	2028	2035	2028
query29	1113	628	507	507
query30	324	227	206	206
query31	1142	1077	958	958
query32	110	62	61	61
query33	555	327	271	271
query34	1194	1127	656	656
query35	769	782	696	696
query36	1394	1429	1244	1244
query37	158	108	93	93
query38	3229	3176	3062	3062
query39	984	931	915	915
query39_1	877	875	877	875
query40	256	128	110	110
query41	72	70	69	69
query42	100	97	97	97
query43	329	323	277	277
query44	
query45	199	190	186	186
query46	1087	1239	748	748
query47	2373	2396	2251	2251
query48	383	435	295	295
query49	655	501	379	379
query50	940	352	261	261
query51	4291	4330	4222	4222
query52	90	91	80	80
query53	254	281	202	202
query54	288	238	234	234
query55	82	78	73	73
query56	263	251	240	240
query57	1489	1398	1303	1303
query58	233	220	207	207
query59	1542	1647	1434	1434
query60	291	246	239	239
query61	164	151	159	151
query62	714	645	583	583
query63	229	194	185	185
query64	2568	808	618	618
query65	
query66	1789	452	357	357
query67	29578	29691	29538	29538
query68	
query69	436	315	264	264
query70	976	940	948	940
query71	312	219	217	217
query72	2998	2736	2455	2455
query73	867	764	428	428
query74	5130	4963	4791	4791
query75	2656	2576	2231	2231
query76	2309	1167	767	767
query77	355	367	288	288
query78	12385	12389	11863	11863
query79	1418	1110	782	782
query80	873	474	403	403
query81	499	283	240	240
query82	570	157	125	125
query83	343	280	253	253
query84	
query85	925	533	429	429
query86	399	300	255	255
query87	3380	3361	3165	3165
query88	3619	2741	2757	2741
query89	430	386	328	328
query90	1756	177	182	177
query91	205	160	138	138
query92	64	63	59	59
query93	1536	1478	864	864
query94	619	347	335	335
query95	671	487	348	348
query96	1039	770	364	364
query97	2690	2689	2560	2560
query98	210	208	198	198
query99	1147	1178	1035	1035
Total cold run time: 251476 ms
Total hot run time: 169314 ms

@hello-stephen

Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.88% (21136/39226)
Line Coverage 37.59% (201255/535383)
Region Coverage 33.59% (157688/469380)
Branch Coverage 34.66% (69091/199332)

@hello-stephen

Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (6/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.76% (28259/38314)
Line Coverage 57.78% (307535/532278)
Region Coverage 54.51% (257032/471564)
Branch Coverage 55.95% (111674/199603)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants