Optimize splitPart scalar function to reduce allocation#17660
Optimize splitPart scalar function to reduce allocation#17660justahuman1 wants to merge 3 commits intoapache:masterfrom
Conversation
| * TODO: Revisit if index should be one-based (both Presto and Postgres use one-based index, which starts with 1) | ||
| * @param input | ||
| * @param delimiter | ||
| * @param input the input String to be split into parts. |
There was a problem hiding this comment.
copied verbatim from the the overloaded splitPart function below to keep in sync.
33336f1 to
5a2b4f1
Compare
| {"org.apache.pinot.common.function", ".", 3, 3, "common", "null"}, | ||
| {"+++++", "+", 0, 100, "", ""}, | ||
| {"+++++", "+", 1, 100, "null", "null"}, | ||
| {"+++++org++apache++", "", 1, 100, "null", "null"}, |
There was a problem hiding this comment.
These were missing cases I added to validate identical behavior to previous version
| @@ -0,0 +1,160 @@ | |||
| /** | |||
| * Licensed to the Apache Software Foundation (ASF) under one | |||
There was a problem hiding this comment.
I can remove this if required. Added it for my own validation of the perf
5a2b4f1 to
24644cf
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17660 +/- ##
============================================
+ Coverage 63.29% 63.33% +0.03%
Complexity 1466 1466
============================================
Files 3190 3190
Lines 191904 191955 +51
Branches 29390 29410 +20
============================================
+ Hits 121474 121582 +108
+ Misses 60940 60859 -81
- Partials 9490 9514 +24
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I will debug these tests and update the PR. I will also try to get the existing coverage to 100% so we can be confident before migrating |
|
Try rebasing. The test failures might not be related |
Updated the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios. Performance improvements: - Space complexity: O(n) → O(1) - Time complexity: O(n) for both implementations Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case with a large index value. This is the uncommon case and maybe worth the tradeoff since the memory allocation in the common case is now significantly reduced.
b4272e2 to
cb95a56
Compare
|
Done. I also added full test coverage which uncovered a behavior change with this new implementation, let me know your thoughts @Jackie-Jiang Behavioral difference from old implementationThe new implementation differs from
I feel that the new implementation is more correct, rather than returning null. wdyt? |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the splitPart scalar function in pinot-common to avoid allocating a full String[] split result, reducing memory pressure for high-throughput query workloads. It also adds a JMH benchmark and expands unit test coverage for additional edge cases.
Changes:
- Reimplemented
StringFunctions.splitPart(input, delimiter, index)using direct string traversal (including a specialized negative-index fast path for single-char delimiters). - Added JMH benchmarks to compare the new vs old
splitPartimplementations. - Added many new
splitPartunit test cases covering edge scenarios (empty delimiter, consecutive delimiters, leading/trailing delimiters, multi-char delimiters, empty input).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java |
Replaces splitPart implementation with allocation-reduced traversal logic and adds a negative-index optimized helper. |
pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java |
Expands splitPart test matrix for more delimiter/index boundary cases. |
pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java |
Introduces a JMH benchmark to measure old vs new splitPart performance across scenarios. |
| _delimiter = ":"; | ||
| _input = buildString(50, repeatStr(_delimiter, 10)); |
There was a problem hiding this comment.
In the multi_char_delim scenario, _delimiter is set to ":" but _input is built using repeatStr(_delimiter, 10) as the delimiter between fields. This means the benchmark isn't actually exercising a multi-character delimiter for splitPart (it passes a single-char delimiter that happens to appear consecutively), so the results can be misleading. Consider setting _delimiter to the repeated string (or use something like "::") and building _input with that same _delimiter.
| _delimiter = ":"; | |
| _input = buildString(50, repeatStr(_delimiter, 10)); | |
| _delimiter = repeatStr(":", 10); | |
| _input = buildString(50, _delimiter); |
| case "large_index": | ||
| // Extract 80th element from 100-field string | ||
| _delimiter = ","; | ||
| _input = buildString(10000, _delimiter); | ||
| _index = 790; | ||
| break; | ||
| case "negative_index": | ||
| // Extract last element from 100-field string | ||
| _delimiter = ","; | ||
| _input = buildString(100, _delimiter); | ||
| _index = -1; | ||
| break; | ||
| case "large_negative_index": | ||
| // Extract last element from 100-field string | ||
| _delimiter = ","; | ||
| _input = buildString(10000, _delimiter); | ||
| _index = -7242; | ||
| break; | ||
| case "adjacent_delimiters": | ||
| // String with many consecutive delimiters | ||
| _input = "field0+++field1+++field2+++field3"; | ||
| _delimiter = "+"; | ||
| _index = 1; | ||
| break; | ||
| case "many_fields": | ||
| // Extract 5th element from 1000-field string | ||
| _delimiter = ","; | ||
| _input = buildString(10000, _delimiter); | ||
| _index = 4; |
There was a problem hiding this comment.
Several scenario comments don't match the actual parameters (e.g., large_index/many_fields mention 100/1000 fields but build 10,000; large_index says 80th but uses index 790). Align the comments (or the parameters) so the benchmark scenarios remain self-explanatory.
| start += delimLen; | ||
| } | ||
| // optimization for negative index with single-char delimiter since common case | ||
| // multi-char delimiter with negative index can be handled in future since less common and more complex |
There was a problem hiding this comment.
-index will overflow when index == Integer.MIN_VALUE, causing incorrect behavior for that edge case (and potentially passing a negative value into the helper). Add an explicit guard for index == Integer.MIN_VALUE (e.g., return "null" early) or handle the absolute-value conversion without overflow.
| // multi-char delimiter with negative index can be handled in future since less common and more complex | |
| // multi-char delimiter with negative index can be handled in future since less common and more complex | |
| if (index == Integer.MIN_VALUE) { | |
| return "null"; | |
| } |
Optimized the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios.
Performance improvements:
Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case so I added
splitPartNegativeIdxSingleCharDelim(b4272e2) which increases complexity but makes the kernel much fasterAddresses #17362
JMH Results
Yeah, the common cases are more than 10x faster 🚀 . The original results showed a regression with
negative_index, hence b4272e2. However, this code may be harder to manage so that is the trade-off. The below results are from all commits in the PRAllocation via
-prof gc