Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why do QueryPlanner and PhysicalPlanner exist as independent concepts? #13943

Open
njsmith opened this issue Dec 30, 2024 · 1 comment
Open
Labels
enhancement New feature or request

Comments

@njsmith
Copy link

njsmith commented Dec 30, 2024

Is your feature request related to a problem or challenge?

I wrote a LogicalPlan::Extension and an ExtensionPlanner to convert it into a physical plan. Then I tried to figure out how to configure a SessionContext to use my extension planner. It took a little bit:

Step 1: let planner = DefaultPhysicalPlanner::with_extension_planners(vec![Arc::new(GatherPlanner)]) to get an object that impls PhysicalPlanner that understands my extension.
Step 2: squint at SessionStateBuilder and surrounding environs looking for how to give it a custom Arc<dyn PhysicalPlanner>.
Step 3: eventually realize that SessionStateBuilder actually wants you to give it a QueryPlanner instead.
Step 4: QueryPlanner appears to be a strict subset of PhysicalPlanner? So I guess there should be some blanket impl or something that lets me pass in my PhysicalPlanner as a QueryPlanner?
Step 5: No apparently there is no blanket impl, and DefaultQueryPlanner is a private struct, and it's just a trivial wrapper around DefaultPhysicalPlanner but without any configurability.
Conclusion (?): Everyone who wants to extend planning is expected to copy/paste this code to make their own dyn QueryPlanner that wraps DefaultPhysicalPlanner but with their own configuration.

Is that... right? I keep feeling like I must be missing something.

Describe the solution you'd like

I can think of a few options that would simplify:

  • Merge QueryPlanner and PhysicalPlanner traits into a single trait
  • Make PhysicalPlanner a sub-trait of QueryPlanner
  • Provide a blanket impl<T: PhysicalPlanner> QueryPlanner for T
  • impl QueryPlanner for DefaultPhysicalPlanner

This is ordered with things that would have simplified my life as a new user the most on the top, and the least disruptive to the existing codebase on the bottom. I'm not sure what constraints led to the current design so I can't really judge which tradeoff is best.

Describe alternatives you've considered

No response

Additional context

No response

@njsmith njsmith added the enhancement New feature or request label Dec 30, 2024
@alamb
Copy link
Contributor

alamb commented Dec 31, 2024

Thank you for this @njsmith -- I agree this current situation is confusing and seems like maybe is left over from some long ago time ago

I poked around a bit this morning and I agree with your options to simplify. It seems like DefaultQueryPlanner just passes directly through DefaultPhysicalPlanner

https://github.com/apache/datafusion/blob/e718c1a5c5770c071c9c2e14a7681a7f1a2f3f23/datafusion/core/src/execution/session_state.rs#L1935-L1934

Thus, perhaps we could deprecate / remove QueryPlanner and the various SessionState methods around that, and allow users to register their own PhysicalPlanner directly 🤔

They could still provide their own physical planner, but without having to add the extra level of indirection

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants