[fix](fe) Reject multi-column NGRAM_BF indexes #64343

Open
airborne12 wants to merge 1 commit into apache:master from airborne12:doris-19296-ngbf-multicol

Conversation

@airborne12

Member

What problem does this PR solve?

Issue Number: DORIS-19296

Related PR: None

Problem Summary:

Creating an NGRAM_BF index with multiple columns passed FE validation and could reach BE tablet creation, where tablet metadata expects each NGRAM_BF index to bind exactly one column. This rejects invalid multi-column NGRAM_BF definitions during FE analysis for both inline table indexes and CREATE INDEX.
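
For illustration, the FE-side check described above could look roughly like the sketch below. This is not the code from this PR: `NgramBfIndexCheck`, `IndexType`, `DdlAnalysisException`, and `checkColumnCount` are hypothetical stand-ins for the real index-definition classes, and the error message text is invented.

```java
import java.util.List;

// Hypothetical stand-in for the FE index-definition validation; not actual Doris code.
class NgramBfIndexCheck {
    // Simplified set of index types, for illustration only.
    enum IndexType { INVERTED, BLOOMFILTER, NGRAM_BF }

    // Stand-in for the exception FE raises when DDL analysis fails.
    static class DdlAnalysisException extends RuntimeException {
        DdlAnalysisException(String message) {
            super(message);
        }
    }

    // Rejects an NGRAM_BF index unless it binds exactly one column.
    // Other index types are left to their own validation rules.
    static void checkColumnCount(String indexName, IndexType type, List<String> columns) {
        if (type == IndexType.NGRAM_BF && columns.size() != 1) {
            throw new DdlAnalysisException(
                    "NGRAM_BF index " + indexName + " must be defined on exactly one column, got "
                            + columns.size());
        }
    }

    public static void main(String[] args) {
        // A single-column definition passes the check.
        checkColumnCount("idx_ok", IndexType.NGRAM_BF, List.of("c1"));

        // A multi-column definition is rejected during analysis.
        try {
            checkColumnCount("idx_bad", IndexType.NGRAM_BF, List.of("c1", "c2"));
        } catch (DdlAnalysisException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the same check would be shared by the inline table index path and the CREATE INDEX path, both fail before any request reaches BE tablet creation.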

Release note

Reject invalid multi-column NGRAM_BF index definitions during DDL analysis.

Check List (For Author)

  • Test

    • Regression test
      • Added coverage in regression-test/suites/index_p0/test_ngram_bloomfilter_index.groovy for inline table index and CREATE INDEX paths. Not run locally because no worktree Doris cluster was started.
    • Unit Test
      • ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.commands.IndexDefinitionTest
    • Manual test (add detailed scripts or steps below)
      • ./build.sh --fe
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes. Invalid multi-column NGRAM_BF index definitions now fail during FE analysis.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

### What problem does this PR solve?

Issue Number: DORIS-19296

Problem Summary: Creating an NGRAM_BF index with multiple columns passed FE validation and could reach BE tablet creation, where tablet metadata expects each NGRAM_BF index to bind exactly one column. This rejects invalid DDL during FE analysis for both inline table indexes and CREATE INDEX.

### Release note

Reject invalid multi-column NGRAM_BF index definitions during DDL analysis.

### Check List (For Author)

- Test: Unit Test / Build

    - ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.commands.IndexDefinitionTest

    - ./build.sh --fe

    - Added regression coverage under index_p0; not run locally because no worktree Doris cluster was started.

- Behavior changed: Yes. Invalid multi-column NGRAM_BF index definitions now fail in FE analysis.

- Does this need documentation: No
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12

Member Author

run buildall

@hello-stephen

Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen

Contributor
TPC-H: Total hot run time: 29324 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dff748278a12a98bd4df3f0486b489e1a4a88671, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17643	4057	4230	4057
q2
q3	10778	1435	815	815
q4	4687	507	349	349
q5	7559	897	580	580
q6	185	175	136	136
q7	772	852	631	631
q8	9406	1540	1580	1540
q9	5845	4495	4415	4415
q10	6775	1813	1535	1535
q11	433	275	249	249
q12	633	426	291	291
q13	18206	3381	2721	2721
q14	257	256	247	247
q15
q16	825	767	708	708
q17	958	978	1010	978
q18	6906	5736	5575	5575
q19	1376	1287	1083	1083
q20	505	415	263	263
q21	6244	2801	2793	2793
q22	461	372	358	358
Total cold run time: 100454 ms
Total hot run time: 29324 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5136	4691	4857	4691
q2
q3	4843	5352	4692	4692
q4	2092	2178	1370	1370
q5	4769	4871	4603	4603
q6	232	177	123	123
q7	1909	1896	1598	1598
q8	2368	2070	2071	2070
q9	7882	7583	7370	7370
q10	4742	4700	4224	4224
q11	535	385	351	351
q12	721	740	526	526
q13	2994	3430	2764	2764
q14	277	281	249	249
q15
q16	678	696	605	605
q17	1275	1247	1249	1247
q18	7186	6884	6816	6816
q19	1124	1094	1122	1094
q20	2222	2219	1951	1951
q21	5258	4558	4378	4378
q22	524	448	424	424
Total cold run time: 56767 ms
Total hot run time: 51146 ms
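
The report does not label its columns. Assuming each row lists the query name, one cold run, two hot runs, and the faster of the two hot runs, the Round 1 totals can be reproduced with a small sketch like the one below. `TpchTotals` and its methods are illustrative names and are not part of the tpch-tools scripts.

```java
// Illustrative only: recomputes the Round 1 totals from the per-query rows above.
class TpchTotals {
    // Round 1 timings in ms as {cold, hot1, hot2}. q2 and q15 are omitted
    // because the report lists no timings for them.
    static final int[][] ROUND1 = {
        {17643, 4057, 4230}, // q1
        {10778, 1435, 815},  // q3
        {4687, 507, 349},    // q4
        {7559, 897, 580},    // q5
        {185, 175, 136},     // q6
        {772, 852, 631},     // q7
        {9406, 1540, 1580},  // q8
        {5845, 4495, 4415},  // q9
        {6775, 1813, 1535},  // q10
        {433, 275, 249},     // q11
        {633, 426, 291},     // q12
        {18206, 3381, 2721}, // q13
        {257, 256, 247},     // q14
        {825, 767, 708},     // q16
        {958, 978, 1010},    // q17
        {6906, 5736, 5575},  // q18
        {1376, 1287, 1083},  // q19
        {505, 415, 263},     // q20
        {6244, 2801, 2793},  // q21
        {461, 372, 358},     // q22
    };

    // Sum of the cold-run column.
    static int totalCold(int[][] rows) {
        int sum = 0;
        for (int[] row : rows) {
            sum += row[0];
        }
        return sum;
    }

    // Sum of the faster of the two hot runs for each query.
    static int totalHot(int[][] rows) {
        int sum = 0;
        for (int[] row : rows) {
            sum += Math.min(row[1], row[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Total cold run time: " + totalCold(ROUND1) + " ms");
        System.out.println("Total hot run time: " + totalHot(ROUND1) + " ms");
    }
}
```

Under that reading, the sums match the reported Round 1 totals of 100454 ms cold and 29324 ms hot.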

@hello-stephen

Contributor
TPC-DS: Total hot run time: 169439 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dff748278a12a98bd4df3f0486b489e1a4a88671, data reload: false

query5	4306	638	479	479
query6	456	197	179	179
query7	4848	543	303	303
query8	364	211	204	204
query9	8776	4090	4056	4056
query10	443	304	262	262
query11	5936	2330	2205	2205
query12	155	107	106	106
query13	1252	614	419	419
query14	6440	5407	5095	5095
query14_1	4408	4429	4447	4429
query15	207	197	179	179
query16	1047	468	377	377
query17	1182	730	604	604
query18	2758	483	363	363
query19	225	187	151	151
query20	117	110	104	104
query21	231	140	121	121
query22	13687	13655	13468	13468
query23	17373	16627	16149	16149
query23_1	16272	16274	16306	16274
query24	7466	1758	1335	1335
query24_1	1342	1316	1316	1316
query25	587	480	402	402
query26	1319	321	169	169
query27	2565	554	340	340
query28	4384	2070	2047	2047
query29	1115	645	504	504
query30	313	236	201	201
query31	1125	1087	968	968
query32	106	64	64	64
query33	601	331	306	306
query34	1179	1154	643	643
query35	743	794	679	679
query36	1382	1344	1245	1245
query37	163	108	91	91
query38	3216	3183	3089	3089
query39	928	924	897	897
query39_1	879	882	877	877
query40	235	120	98	98
query41	65	62	61	61
query42	93	93	92	92
query43	313	313	280	280
query44	
query45	191	186	177	177
query46	1076	1187	768	768
query47	2312	2365	2210	2210
query48	407	408	296	296
query49	630	482	365	365
query50	999	379	261	261
query51	4373	4415	4276	4276
query52	89	89	77	77
query53	249	280	194	194
query54	271	215	194	194
query55	83	75	70	70
query56	249	219	223	219
query57	1436	1412	1332	1332
query58	249	224	213	213
query59	1584	1666	1466	1466
query60	295	247	221	221
query61	161	162	149	149
query62	706	651	582	582
query63	234	185	191	185
query64	2516	780	668	668
query65	
query66	1738	455	356	356
query67	29800	29823	29579	29579
query68	
query69	453	294	272	272
query70	997	939	940	939
query71	299	221	214	214
query72	3064	2727	2426	2426
query73	835	766	449	449
query74	5116	4970	4787	4787
query75	2668	2599	2242	2242
query76	2369	1164	752	752
query77	353	366	292	292
query78	12453	12613	11828	11828
query79	1375	1118	755	755
query80	600	478	408	408
query81	452	277	243	243
query82	586	162	123	123
query83	352	274	255	255
query84	
query85	884	540	444	444
query86	362	320	274	274
query87	3398	3342	3218	3218
query88	3636	2745	2716	2716
query89	418	381	342	342
query90	2035	184	183	183
query91	193	166	141	141
query92	64	61	58	58
query93	1479	1474	848	848
query94	536	372	299	299
query95	681	393	342	342
query96	1098	810	320	320
query97	2702	2736	2543	2543
query98	212	208	205	205
query99	1148	1201	1052	1052
Total cold run time: 251372 ms
Total hot run time: 169439 ms

@hello-stephen

Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/8) 🎉
Increment coverage report
Complete coverage report
