Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

design: A unified optimizer interface #20569

Merged

Conversation

mgree
Copy link
Contributor

@mgree mgree commented Jul 14, 2023

We have talked off and on about unifying the interface to the optimizer. This design doc discusses why that would be useful and what it would look like.

Markdown rendering of

Motivation

  • This PR refactors existing code.

Tips for reviewer

Conspicuously absent: the actual proposed interface. I think this is best discussed at the next on-site.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@mgree mgree added A-docs Area: documentation A-optimization Area: query optimization and transformation A-compute Area: compute A-ADAPTER Topics related to the ADAPTER layer labels Jul 14, 2023
@aalexandrov
Copy link
Contributor

aalexandrov commented Aug 28, 2023

Adding a comment to put some extra items that I would like to discuss and nail in this doc:

  1. What kinds of context to we have (mutable vs read-only)?
  2. Who initializes and owns them (Coordinator, DataflowBuilder, ...)?

For example, we want to communicate notices the optimizer back to the caller (the Coordinator). For me, the ideal way to do this is to say that:

  1. The Coordinator initializes a piece of mutable context of type Vec<OptimizerNotice> that it can pass down.
  2. The various optimizer stages then populate this context.
  3. The Coordinator than interprets the context (e.g., by forwarding any notices to the user or by saving them on the side so we can expose them in the web console).

P.S.: Sadly, even if we had this the fact that HIR ⇒ MIR lowering and local MIR ⇒ MIR optimization happen before we enter the Coordinator leaves our hands a bit tied.

@aalexandrov aalexandrov force-pushed the optimizer-interface-design-doc branch from a2d6f8b to ad73d39 Compare October 3, 2023 11:53
@aalexandrov aalexandrov self-assigned this Oct 3, 2023
@aalexandrov aalexandrov marked this pull request as ready for review October 3, 2023 11:55
@aalexandrov aalexandrov force-pushed the optimizer-interface-design-doc branch 2 times, most recently from f5854ba to d21d02c Compare October 3, 2023 12:00
@aalexandrov
Copy link
Contributor

@mjibson / @ggevay / @mgree I've pushed a new version of the design doc along with an MVP PR in #22115.

If you agree with the approach and the proposed API I will start forking off separate PRs for each statement type along the lines suggested in the doc and demonstrated in #22115.

@aalexandrov aalexandrov force-pushed the optimizer-interface-design-doc branch from d21d02c to f5620f6 Compare October 3, 2023 12:08
its client sites ("bootstrap `T`", "execute `T`", and "explain `T`") therefore
is constantly at risk of getting out of sync. When this happens, it negatively
impacts the overall product experience. For example, in the past we had
situations with "`EXPLAIN` drift", where "explain `SELECT`" results wre not
Copy link
Contributor

@aalexandrov aalexandrov Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW this is the case even now because "execute SELECT" statements produce a dataflow with an exported IndexDesc that we don't create in the "explain SELECT".

dataflow.export_index(
index_id,
IndexDesc {
on_id: view_id,
key: key.clone(),
},
typ.clone(),
);

It's not something that currently affects the EXPLAIN output, but still...

@aalexandrov aalexandrov force-pushed the optimizer-interface-design-doc branch 4 times, most recently from 0ed9a4d to 82f86ac Compare October 3, 2023 18:06
Copy link
Contributor

@maddyblue maddyblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So well written and explained. This looks great! Excited to see this happen.

doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
Copy link
Contributor

@teskje teskje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
Copy link
Contributor Author

@mgree mgree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all this work and extra detail!

doc/developer/design/20230714_optimizer_interface.md Outdated Show resolved Hide resolved
@aalexandrov aalexandrov force-pushed the optimizer-interface-design-doc branch from 82f86ac to d7ce174 Compare October 9, 2023 07:35
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in MaterializeInc#20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Introduce `OptimizeView` in the `CREATE INDEX` optimization paths in
the `Coordinator`, `Catalog`, and `CatalogState` implementations:

- `CatalogState::parse_view_item` (the `Plan::CreateView` part),
- `CatalogState::parse_item` (the `Plan::CreateView` part).
- `Coordinator::sequence_create_view`.
- `Coordinator::sequence_explain_timestamp_begin_inner`.
- `Coordinator::sequence_insert`.

See the rollout proposal in #20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Fork the old implementations of `Coordinator::sequence_subscribe` by
duplicating code and dispatching between the old and the new code path
based on the value of the `enable_unified_optimizer_api` feature flag.
At the moment the two code paths are still identical, but this will
change with the next commit.

See the rollout proposal in #20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Introduce `OptimizeSubscribe` in the `SUBSCRIBE` optimization paths in
the `Coordinator::sequence_subscribe` implementation.

See the rollout proposal in #20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Fork the old implementations of

- `Coordinator::sequence_peek*`, and
- `Coordinator::explain_query_optimizer_pipeline`,

by duplicating code and dispatching between the old and the new code
path based on the value of the `enable_unified_optimizer_api` feature
flag. At the moment the two code paths are still identical, but this
will change with the next commit.

See the rollout proposal in #20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_peek` optimization paths in
the `Coordinator`.

See the rollout proposal in #20569 for details.
matthewarthur pushed a commit that referenced this pull request Nov 7, 2023
Introduce `OptimizePeek` in the `sequence_explain_plan` optimization
path for `Query` explainees in the `Coordinator`.

See the rollout proposal in #20569 for details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ADAPTER Topics related to the ADAPTER layer A-compute Area: compute A-docs Area: documentation A-optimization Area: query optimization and transformation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants