Optimize splitPart scalar function to reduce allocation by justahuman1 · Pull Request #17660 · apache/pinot

justahuman1 · 2026-02-07T20:52:19Z

Optimized the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios.

Performance improvements:

Space complexity: O(n) → O(1)
Time complexity: O(n) for both implementations

Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case so I added splitPartNegativeIdxSingleCharDelim (b4272e2) which increases complexity but makes the kernel much faster

Addresses #17362

JMH Results

Yeah, the common cases are more than 10x faster 🚀 . The original results showed a regression with negative_index, hence b4272e2. However, this code may be harder to manage so that is the trade-off. The below results are from all commits in the PR

Operation Type	New (ns/op)	Old (ns/op)	Speedup (Old/New)
small_index	777.8	38,722.2	≈ 49.8× faster
large_index	295,474.5	3,867,775.5	≈ 13.1× faster
negative_index	13.2	38,635.4	≈ 2,929× faster
large_negative_index	47,797.4	3,824,985.5	≈ 80× faster
adjacent_delimiters	779.9	3,864.4	≈ 5× faster

Benchmark                                       (_scenario)  Mode  Cnt           Score          Error  Units
BenchmarkSplitPart.testSplitPartNew             small_index  avgt   10         777.797 ±       22.025  ns/op
BenchmarkSplitPart.testSplitPartNew             large_index  avgt   10      295474.514 ±     2876.945  ns/op
BenchmarkSplitPart.testSplitPartNew          negative_index  avgt   10          13.198 ±        2.553  ns/op
BenchmarkSplitPart.testSplitPartNew    large_negative_index  avgt   10       47797.384 ±      465.119  ns/op
BenchmarkSplitPart.testSplitPartNew     adjacent_delimiters  avgt   10         779.944 ±       14.194  ns/op
BenchmarkSplitPart.testSplitPartNew             many_fields  avgt   10        1935.942 ±       59.342  ns/op
BenchmarkSplitPart.testSplitPartNew        multi_char_delim  avgt   10        4281.167 ±      102.780  ns/op
BenchmarkSplitPart.testSplitPartNew  large_multi_char_delim  avgt   10     1160607.923 ±    20648.116  ns/op
BenchmarkSplitPart.testSplitPartOld             small_index  avgt   10       38722.193 ±     1463.000  ns/op
BenchmarkSplitPart.testSplitPartOld             large_index  avgt   10     3867775.470 ±   107083.008  ns/op
BenchmarkSplitPart.testSplitPartOld          negative_index  avgt   10       38635.403 ±     1325.020  ns/op
BenchmarkSplitPart.testSplitPartOld    large_negative_index  avgt   10     3824985.465 ±    89641.937  ns/op
BenchmarkSplitPart.testSplitPartOld     adjacent_delimiters  avgt   10        3864.391 ±      101.146  ns/op
BenchmarkSplitPart.testSplitPartOld             many_fields  avgt   10     3814644.277 ±   132924.021  ns/op
BenchmarkSplitPart.testSplitPartOld        multi_char_delim  avgt   10      182955.787 ±     3254.768  ns/op
BenchmarkSplitPart.testSplitPartOld  large_multi_char_delim  avgt   10  1043425954.300 ± 29614093.332  ns/op

Allocation via `-prof gc`

     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.000 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.002 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.007 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  56.036 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  56.043 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  6616.001 B/op

     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  304.000 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  3560.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  6616.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.043 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.044 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.045 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.046 B/op

justahuman1 · 2026-02-07T20:52:46Z

pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java

   * TODO: Revisit if index should be one-based (both Presto and Postgres use one-based index, which starts with 1)
-   * @param input
-   * @param delimiter
+   * @param input the input String to be split into parts.


copied verbatim from the the overloaded splitPart function below to keep in sync.

justahuman1 · 2026-02-07T20:54:00Z

pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java

        {"org.apache.pinot.common.function", ".", 3, 3, "common", "null"},
        {"+++++", "+", 0, 100, "", ""},
        {"+++++", "+", 1, 100, "null", "null"},
+        {"+++++org++apache++", "", 1, 100, "null", "null"},


These were missing cases I added to validate identical behavior to previous version

justahuman1 · 2026-02-07T20:54:47Z

pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java

@@ -0,0 +1,160 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one


I can remove this if required. Added it for my own validation of the perf

codecov-commenter · 2026-02-13T01:54:41Z

Codecov Report

❌ Patch coverage is 92.98246% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.33%. Comparing base (7eec9a7) to head (cb95a56).

Files with missing lines	Patch %	Lines
.../pinot/common/function/scalar/StringFunctions.java	92.98%	1 Missing and 3 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #17660      +/-   ##
============================================
+ Coverage     63.29%   63.33%   +0.03%     
  Complexity     1466     1466              
============================================
  Files          3190     3190              
  Lines        191904   191955      +51     
  Branches      29390    29410      +20     
============================================
+ Hits         121474   121582     +108     
+ Misses        60940    60859      -81     
- Partials       9490     9514      +24

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-11	`63.29% <92.98%> (+0.04%)`	⬆️
java-21	`63.24% <92.98%> (-0.02%)`	⬇️
temurin	`63.33% <92.98%> (+0.03%)`	⬆️
unittests	`63.33% <92.98%> (+0.03%)`	⬆️
unittests1	`55.60% <92.98%> (+<0.01%)`	⬆️
unittests2	`34.30% <0.00%> (+0.03%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

justahuman1 · 2026-02-13T22:05:03Z

I will debug these tests and update the PR. I will also try to get the existing coverage to 100% so we can be confident before migrating

Jackie-Jiang · 2026-02-18T01:23:30Z

Try rebasing. The test failures might not be related

Updated the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios. Performance improvements: - Space complexity: O(n) → O(1) - Time complexity: O(n) for both implementations Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case with a large index value. This is the uncommon case and maybe worth the tradeoff since the memory allocation in the common case is now significantly reduced.

justahuman1 · 2026-03-07T04:21:58Z

Done. I also added full test coverage which uncovered a behavior change with this new implementation, let me know your thoughts @Jackie-Jiang

Behavioral difference from old implementation

The new implementation differs from StringUtils.splitByWholeSeparator for empty input strings only. The old approach returns an empty array for "", treating it as having 0 fields.
The new implementation treats "" as having 1 field (the empty string itself), which is consistent with how non-empty strings without a matching delimiter are handled:

input	delimiter	index	old impl	new impl
`""`	`"."`	0	`"null"`	`""`
`""`	`"."`	-1	`"null"`	`""`
`""`	`"::"`	-1	`"null"`	`""`
`"a"`	`"."`	0	`"a"`	`"a"` (same)

I feel that the new implementation is more correct, rather than returning null. wdyt?

Copilot

Pull request overview

This PR optimizes the splitPart scalar function in pinot-common to avoid allocating a full String[] split result, reducing memory pressure for high-throughput query workloads. It also adds a JMH benchmark and expands unit test coverage for additional edge cases.

Changes:

Reimplemented StringFunctions.splitPart(input, delimiter, index) using direct string traversal (including a specialized negative-index fast path for single-char delimiters).
Added JMH benchmarks to compare the new vs old splitPart implementations.
Added many new splitPart unit test cases covering edge scenarios (empty delimiter, consecutive delimiters, leading/trailing delimiters, multi-char delimiters, empty input).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java`	Replaces `splitPart` implementation with allocation-reduced traversal logic and adds a negative-index optimized helper.
`pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java`	Expands `splitPart` test matrix for more delimiter/index boundary cases.
`pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java`	Introduces a JMH benchmark to measure old vs new `splitPart` performance across scenarios.

Copilot · 2026-03-08T01:43:43Z

pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java

+        _delimiter = ":";
+        _input = buildString(50, repeatStr(_delimiter, 10));


In the multi_char_delim scenario, _delimiter is set to ":" but _input is built using repeatStr(_delimiter, 10) as the delimiter between fields. This means the benchmark isn't actually exercising a multi-character delimiter for splitPart (it passes a single-char delimiter that happens to appear consecutively), so the results can be misleading. Consider setting _delimiter to the repeated string (or use something like "::") and building _input with that same _delimiter.

Suggested change

_delimiter = ":";

_input = buildString(50, repeatStr(_delimiter, 10));

_delimiter = repeatStr(":", 10);

_input = buildString(50, _delimiter);

Copilot · 2026-03-08T01:43:44Z

pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java

+      case "large_index":
+        // Extract 80th element from 100-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = 790;
+        break;
+      case "negative_index":
+        // Extract last element from 100-field string
+        _delimiter = ",";
+        _input = buildString(100, _delimiter);
+        _index = -1;
+        break;
+      case "large_negative_index":
+        // Extract last element from 100-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = -7242;
+        break;
+      case "adjacent_delimiters":
+        // String with many consecutive delimiters
+        _input = "field0+++field1+++field2+++field3";
+        _delimiter = "+";
+        _index = 1;
+        break;
+      case "many_fields":
+        // Extract 5th element from 1000-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = 4;


Several scenario comments don't match the actual parameters (e.g., large_index/many_fields mention 100/1000 fields but build 10,000; large_index says 80th but uses index 790). Align the comments (or the parameters) so the benchmark scenarios remain self-explanatory.

Copilot · 2026-03-08T01:43:44Z

pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java

+      start += delimLen;
+    }
+    // optimization for negative index with single-char delimiter since common case
+    // multi-char delimiter with negative index can be handled in future since less common and more complex


-index will overflow when index == Integer.MIN_VALUE, causing incorrect behavior for that edge case (and potentially passing a negative value into the helper). Add an explicit guard for index == Integer.MIN_VALUE (e.g., return "null" early) or handle the absolute-value conversion without overflow.

Suggested change

// multi-char delimiter with negative index can be handled in future since less common and more complex

// multi-char delimiter with negative index can be handled in future since less common and more complex

if (index == Integer.MIN_VALUE) {

return "null";

}

justahuman1 commented Feb 7, 2026

View reviewed changes

justahuman1 force-pushed the split-part-malloc-fix branch from 33336f1 to 5a2b4f1 Compare February 7, 2026 20:53

justahuman1 commented Feb 7, 2026

View reviewed changes

justahuman1 mentioned this pull request Feb 7, 2026

Improve splitPart function #17362

Open

justahuman1 force-pushed the split-part-malloc-fix branch from 5a2b4f1 to 24644cf Compare February 7, 2026 21:38

justahuman1 changed the title ~~Update splitPart scalar function to reduce allocation~~ Optimize splitPart scalar function to reduce allocation Feb 10, 2026

Jackie-Jiang added enhancement performance labels Feb 13, 2026

justahuman1 added 3 commits March 6, 2026 20:15

add negative pass

66dddf9

Add comprehensive splitPart test coverage for edge cases

cb95a56

justahuman1 force-pushed the split-part-malloc-fix branch from b4272e2 to cb95a56 Compare March 7, 2026 04:16

xiangfu0 requested review from Copilot, deemoliu and xiangfu0 March 8, 2026 01:38

Copilot started reviewing on behalf of xiangfu0 March 8, 2026 01:39 View session

Copilot AI reviewed Mar 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize splitPart scalar function to reduce allocation#17660

Optimize splitPart scalar function to reduce allocation#17660
justahuman1 wants to merge 3 commits intoapache:masterfrom
justahuman1:split-part-malloc-fix

justahuman1 commented Feb 7, 2026 •

edited

Loading

Uh oh!

justahuman1 Feb 7, 2026

Uh oh!

justahuman1 Feb 7, 2026

Uh oh!

justahuman1 Feb 7, 2026

Uh oh!

codecov-commenter commented Feb 13, 2026 •

edited

Loading

Uh oh!

justahuman1 commented Feb 13, 2026

Uh oh!

Jackie-Jiang commented Feb 18, 2026

Uh oh!

justahuman1 commented Mar 7, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 8, 2026

Uh oh!

Copilot AI Mar 8, 2026

Uh oh!

Copilot AI Mar 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		@@ -0,0 +1,160 @@
		/**
		* Licensed to the Apache Software Foundation (ASF) under one

		_delimiter = ":";
		_input = buildString(50, repeatStr(_delimiter, 10));

Conversation

justahuman1 commented Feb 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

JMH Results

Allocation via -prof gc

Uh oh!

justahuman1 Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

justahuman1 Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

justahuman1 Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Feb 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

justahuman1 commented Feb 13, 2026

Uh oh!

Jackie-Jiang commented Feb 18, 2026

Uh oh!

justahuman1 commented Mar 7, 2026

Behavioral difference from old implementation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

justahuman1 commented Feb 7, 2026 •

edited

Loading

Allocation via `-prof gc`

codecov-commenter commented Feb 13, 2026 •

edited

Loading