Initial version for reverse row groups #26
base: branch-51
Pull request overview
This PR implements Phase 1 of sort pushdown optimization to improve TopK query performance. When a query requests data in reverse order of a Parquet file's natural ordering, the optimizer now enables reverse row group scanning, which allows early termination in TopK queries while keeping the Sort operator for correctness.
Key changes:
- Adds `enable_sort_pushdown` configuration option (default: true)
- Implements reverse row group scanning for Parquet files
- Returns inexact ordering to enable TopK early termination benefits
- Adds comprehensive test coverage across multiple file formats
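To make the early-termination point concrete, here is a minimal sketch, assuming a file whose values are globally ascending across row groups. `RowGroup` and `top_k_desc_reverse_scan` are hypothetical names for illustration only; they are not the types or functions in this PR, and the real operator may decide when to stop differently.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a Parquet row group whose file is sorted ascending.
struct RowGroup {
    values: Vec<i64>,
}

impl RowGroup {
    /// Plays the role of a max statistic for the row group.
    fn max(&self) -> Option<i64> {
        self.values.iter().copied().max()
    }
}

/// Returns the k largest values in descending order and the number of row groups read.
/// Row groups are visited last to first (the "reverse scan").
fn top_k_desc_reverse_scan(groups: &[RowGroup], k: usize) -> (Vec<i64>, usize) {
    if k == 0 {
        return (Vec::new(), 0);
    }
    // Min-heap holding the k largest values seen so far.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::new();
    let mut groups_read = 0;
    for group in groups.iter().rev() {
        if heap.len() == k {
            if let (Some(Reverse(threshold)), Some(max)) = (heap.peek(), group.max()) {
                // Assumption: the file is ascending, so no earlier row group can
                // hold a value larger than this one's max. Safe to stop here.
                if max <= *threshold {
                    break;
                }
            }
        }
        groups_read += 1;
        for &v in &group.values {
            heap.push(Reverse(v));
            if heap.len() > k {
                heap.pop(); // drop the smallest so only k values remain
            }
        }
    }
    // The scan order is only approximately right (rows inside a row group are
    // still ascending), so a final sort is kept for correctness.
    let mut out: Vec<i64> = heap.into_iter().map(|Reverse(v)| v).collect();
    out.sort_unstable_by(|a, b| b.cmp(a));
    (out, groups_read)
}
```

With three ascending row groups and `k = 2`, this sketch stops after reading only the last row group, which is the kind of saving an inexact ordering is meant to unlock.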
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| docs/source/user-guide/configs.md | Documents new configuration options including enable_sort_pushdown, force_filter_selections, enable_ansi_mode, and hash join InList pushdown settings |
| datafusion/common/src/config.rs | Adds enable_sort_pushdown configuration option with detailed documentation |
| datafusion/physical-optimizer/src/pushdown_sort.rs | Implements the PushdownSort optimizer rule that detects SortExec nodes and attempts to push sort requirements down to data sources |
| datafusion/physical-plan/src/sort_pushdown.rs | Defines SortOrderPushdownResult enum for communicating sort pushdown results (Exact, Inexact, Unsupported) |
| datafusion/physical-plan/src/execution_plan.rs | Adds try_pushdown_sort trait method to ExecutionPlan for sort optimization |
| datafusion/datasource-parquet/src/source.rs | Implements reverse row group scanning logic in ParquetSource with reverse_row_groups field |
| datafusion/datasource-parquet/src/sort.rs | Implements reverse_row_selection function to adjust row selections for reversed row group order |
| datafusion/datasource-parquet/src/opener.rs | Integrates reverse scanning into ParquetOpener using PreparedAccessPlan |
| datafusion/physical-expr-common/src/sort_expr.rs | Adds is_reverse and is_reversed_sort_options helpers for detecting reversed orderings |
| datafusion/sqllogictest/test_files/*.slt | Comprehensive SQL logic tests validating reverse scan behavior with various scenarios |
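As a rough illustration of how the three results in the table (Exact, Inexact, Unsupported) and the reversed-ordering helpers could fit together, here is a simplified sketch. All names below (`SortOptions`, `SortKey`, `PushdownResult`, `classify`) are hypothetical stand-ins and do not reproduce the signatures added in this PR; in particular, when a source reports Exact is an assumption of this sketch.

```rust
/// Hypothetical, simplified per-column sort options.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SortOptions {
    descending: bool,
    nulls_first: bool,
}

/// Hypothetical sort key: a column name plus its options.
#[derive(Clone, Debug, PartialEq, Eq)]
struct SortKey {
    column: String,
    options: SortOptions,
}

/// Hypothetical mirror of the three outcomes named in the table.
#[derive(Debug, PartialEq, Eq)]
enum PushdownResult {
    Exact,
    Inexact,
    Unsupported,
}

/// Two options are reverses of each other when both the direction and the
/// null placement are flipped (ASC NULLS LAST <-> DESC NULLS FIRST).
fn is_reversed_options(a: SortOptions, b: SortOptions) -> bool {
    a.descending != b.descending && a.nulls_first != b.nulls_first
}

/// True when `requested` is a column-by-column reversal of a prefix of `natural`.
fn is_reverse(requested: &[SortKey], natural: &[SortKey]) -> bool {
    !requested.is_empty()
        && requested.len() <= natural.len()
        && requested
            .iter()
            .zip(natural)
            .all(|(r, n)| r.column == n.column && is_reversed_options(r.options, n.options))
}

/// Classifies a requested ordering against the file's declared ordering.
fn classify(requested: &[SortKey], natural: &[SortKey]) -> PushdownResult {
    if requested.is_empty() || requested.len() > natural.len() {
        return PushdownResult::Unsupported;
    }
    if requested.iter().zip(natural).all(|(r, n)| r == n) {
        // Assumption: a matching prefix needs no extra work from the source.
        PushdownResult::Exact
    } else if is_reverse(requested, natural) {
        // Reversing row groups only approximates the order, so the Sort stays.
        PushdownResult::Inexact
    } else {
        PushdownResult::Unsupported
    }
}
```

Keeping Inexact as a separate outcome lets the optimizer keep the Sort operator in the plan while still asking the source to scan in a more favorable order.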
fd45ae8 to
4ed0668
Compare
…#19557)

- Closes [apache#19535](apache#19535)

Reverse row selection should respect the row group index; this PR fixes the issue.

Are these changes tested? Yes

Are there any user-facing changes? No

(cherry picked from commit 27de50d)
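One plausible reading of "respect the row group index" is sketched below: when some row groups are pruned, the row count used to split the selection must be looked up by each scanned row group's index in the file metadata, not by its position in the scan list. `Selector` and this `reverse_row_selection` signature are hypothetical simplifications and are not the code from the fix.

```rust
/// Hypothetical, simplified row selector: select or skip `row_count` rows.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Selector {
    row_count: usize,
    skip: bool,
}

/// Reorders a forward selection so that row groups are visited last to first.
///
/// * `selectors` covers the rows of the scanned row groups, in forward order.
/// * `file_row_counts[i]` is the row count of row group `i` in the file.
/// * `scanned` lists the indices of the row groups actually read, ascending.
///
/// Rows inside each row group keep their forward order.
/// Panics if an index in `scanned` is out of bounds for `file_row_counts`.
fn reverse_row_selection(
    selectors: &[Selector],
    file_row_counts: &[usize],
    scanned: &[usize],
) -> Vec<Selector> {
    let mut per_group: Vec<Vec<Selector>> = Vec::with_capacity(scanned.len());
    let mut iter = selectors.iter().copied();
    // Remainder of a selector that straddles a row group boundary.
    let mut pending: Option<Selector> = None;

    for &rg_index in scanned {
        // Look up by row group index, not by position in `scanned`.
        let mut remaining = file_row_counts[rg_index];
        let mut chunk = Vec::new();
        while remaining > 0 {
            let Some(sel) = pending.take().or_else(|| iter.next()) else {
                break; // selection ended before the row group did
            };
            let take = sel.row_count.min(remaining);
            if take > 0 {
                chunk.push(Selector { row_count: take, skip: sel.skip });
            }
            remaining -= take;
            if sel.row_count > take {
                pending = Some(Selector { row_count: sel.row_count - take, skip: sel.skip });
            }
        }
        per_group.push(chunk);
    }

    // Reverse the order of row groups while keeping each chunk intact.
    per_group.into_iter().rev().flatten().collect()
}
```

For example, with file row counts `[100, 50, 80]` and only row groups 0 and 2 scanned, a positional lookup would split the selection at 100 and 50 rows instead of 100 and 80, which shifts every selector in the last row group. The sketch does not merge adjacent selectors of the same kind; a real implementation might.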
## Which issue does this PR close?

Add sorted data benchmark.

- Closes [apache#18976](apache#18976)

## Rationale for this change

## What changes are included in this PR?

## Are these changes tested?

Yes. Test results for the reverse Parquet PR (apache#18817) show it is about 30X faster than the main branch on sorted data:

```text
Running `/Users/zhuqi/arrow-datafusion/target/release/dfbench clickbench --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet --queries-path /Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data --sorted-by EventTime --sort-order ASC -o /Users/zhuqi/arrow-datafusion/benchmarks/results/reverse_parquet/data_sorted_clickbench.json`
Running benchmarks with the following options: RunOpt { query: None, pushdown: false, common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false }, path: "/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet", queries_path: "/Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data", output_path: Some("/Users/zhuqi/arrow-datafusion/benchmarks/results/reverse_parquet/data_sorted_clickbench.json"), sorted_by: Some("EventTime"), sort_order: "ASC" }
⚠️ Forcing target_partitions=1 to preserve sort order
⚠️ (Because we want to get the pure performance benefit of sorted data to compare)
📊 Session config target_partitions: 1
Registering table with sort order: EventTime ASC
Executing: CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION '/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet' WITH ORDER ("EventTime" ASC)
Q0: -- Must set for ClickBench hits_partitioned dataset. See apache#16591
-- set datafusion.execution.parquet.binary_as_string = true
SELECT * FROM hits ORDER BY "EventTime" DESC limit 10;
Query 0 iteration 0 took 14.7 ms and returned 10 rows
Query 0 iteration 1 took 10.2 ms and returned 10 rows
Query 0 iteration 2 took 8.7 ms and returned 10 rows
Query 0 iteration 3 took 7.9 ms and returned 10 rows
Query 0 iteration 4 took 7.9 ms and returned 10 rows
Query 0 avg time: 9.85 ms
+ set +x
Done
```

And the main branch result:

```text
Running `/Users/zhuqi/arrow-datafusion/target/release/dfbench clickbench --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet --queries-path /Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data --sorted-by EventTime --sort-order ASC -o /Users/zhuqi/arrow-datafusion/benchmarks/results/issue_18976/data_sorted_clickbench.json`
Running benchmarks with the following options: RunOpt { query: None, pushdown: false, common: CommonOpt { iterations: 5, partitions: None, batch_size: None, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false }, path: "/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet", queries_path: "/Users/zhuqi/arrow-datafusion/benchmarks/queries/clickbench/queries/sorted_data", output_path: Some("/Users/zhuqi/arrow-datafusion/benchmarks/results/issue_18976/data_sorted_clickbench.json"), sorted_by: Some("EventTime"), sort_order: "ASC" }
⚠️ Forcing target_partitions=1 to preserve sort order
⚠️ (Because we want to get the pure performance benefit of sorted data to compare)
📊 Session config target_partitions: 1
Registering table with sort order: EventTime ASC
Executing: CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION '/Users/zhuqi/arrow-datafusion/benchmarks/data/hits_0_sorted.parquet' WITH ORDER ("EventTime" ASC)
Q0: -- Must set for ClickBench hits_partitioned dataset. See apache#16591
-- set datafusion.execution.parquet.binary_as_string = true
SELECT * FROM hits ORDER BY "EventTime" DESC limit 10;
Query 0 iteration 0 took 331.1 ms and returned 10 rows
Query 0 iteration 1 took 286.0 ms and returned 10 rows
Query 0 iteration 2 took 283.3 ms and returned 10 rows
Query 0 iteration 3 took 283.8 ms and returned 10 rows
Query 0 iteration 4 took 286.5 ms and returned 10 rows
Query 0 avg time: 294.13 ms
+ set +x
Done
```

## Are there any user-facing changes?

---------

Co-authored-by: Martin Grigorov <[email protected]>
Co-authored-by: Yongting You <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>

(cherry picked from commit cde6dfa)