Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

explain: support EXPLAIN CREATE INDEX #22021

Merged
merged 5 commits into from
Sep 29, 2023

Conversation

aalexandrov
Copy link
Contributor

@aalexandrov aalexandrov commented Sep 27, 2023

Add support for

EXPLAIN CREATE INDEX

syntax.

Motivation

  • This PR adds a known-desirable feature.

Closes MaterializeInc/database-issues#5301.

Tips for reviewer

  • Structurally this PR is very similar to explain: support EXPLAIN CREATE MATERIALIZED VIEW #21973.
  • See individual commit messages for details.
  • For test coverage I cloned tpch_select.slt and changed the workload to use CREATE DEFAULT INDEX ON $qXX where $qXX is a view for query XX (you can two tpch_*.slt files to see what changed).

@ggevay note that the "index export" dataflows use ReadStorage, e.g.

materialize.public.q07_primary_idx:
  ArrangeBy keys=[[supp_nation, cust_nation, l_year]] // { arity: 4 }
    ReadStorage materialize.public.q07 // { arity: 4 }

This is surprising to me because it somehow alludes that the materialize.public.q07 source is written to S3 with a perist sink. Maybe this is an error in the usage type inference code?

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@aalexandrov aalexandrov requested review from ggevay and a team September 27, 2023 19:59
@aalexandrov aalexandrov requested a review from a team as a code owner September 27, 2023 19:59
@aalexandrov aalexandrov requested review from andrewrodriguez-m and removed request for a team and andrewrodriguez-m September 27, 2023 19:59
@aalexandrov aalexandrov self-assigned this Sep 27, 2023
@aalexandrov aalexandrov added the A-ADAPTER Topics related to the ADAPTER layer label Sep 27, 2023
@aalexandrov aalexandrov force-pushed the issue_18089 branch 4 times, most recently from 1219030 to 4a1a4da Compare September 27, 2023 22:13
Copy link
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage mostly looks good: https://buildkite.com/materialize/coverage/builds/236#018ad8d2-4da7-4f15-b52a-66c29d6d8299
I'll add one datadriven test for the BROKEN syntax.

E: Into<AdapterError>,
{
if guard {
let r: Result<Result<R, E>, _> = mz_ore::panic::catch_unwind(AssertUnwindSafe(f));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a test case that exercises this? Or does mz_panic not cause the panic in the right stage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only test catch_unwind(broken, "global", f) and catch_unwind(broken, "local", f) calls with mz_panic(). I'm not aware of a good way to test hir_to_mir and finalize_dataflow stages.

Copy link
Contributor Author

@aalexandrov aalexandrov Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some EXPLAIN BROKEN <statement> SLT tests in a new commit.

Copy link
Contributor Author

@aalexandrov aalexandrov Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the EXPLAIN BROKEN <statement> SLT tests because they are breaking the CI analysis step (the panic and ERROR messages are flagged). If we have a fix for that and I'll add it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the EXPLAIN BROKEN back with the CI analysis fix.

@aalexandrov aalexandrov force-pushed the issue_18089 branch 4 times, most recently from 5c0bbdd to f479bf6 Compare September 28, 2023 17:53
Copy link
Contributor

@ggevay ggevay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just minor comments

note that the "index export" dataflows use ReadStorage, e.g.
...
This is surprising to me because it somehow alludes that the materialize.public.q07 source is written to S3 with a perist sink. Maybe this is an error in the usage type inference code?

Oh, good catch! This should be a different category from all the existing Get printings (AccessStrategy): not a local Get, not a Persist read, not an existing index read. It's a read of a GlobalId, but it's still a read of something from the same dataflow, just from a different item in objects_to_build. I've created an issue: https://github.com/MaterializeInc/materialize/issues/22064

src/sql/src/plan.rs Outdated Show resolved Hide resolved
src/adapter/src/coord/sequencer/inner.rs Outdated Show resolved Hide resolved
}
}

// Execute the `optimize/finalize_dataflow` stage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimize/finalize_dataflow is the Plan::finalize_dataflow fn, but this call is to the Coordinator::finalize_dataflow fn, which does a few more things before calling the other one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it preps up the dataflow as_of before calling Plan::finalize_dataflow (which is the method that produces the span. Conceptually the comment makes sense though (the stage is executed by the commented code snippet), and I would rather keep this consistent across all Coordinator::explain_ methods, as this code will hopefully be removed in the next weeks.

src/adapter/src/coord/sequencer/inner.rs Outdated Show resolved Hide resolved
});

// Global optimziation
// -------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sequence_create_index has a ensure_cluster_can_host_compute_item call. Is it ok that we are not calling this here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are also not calling this in explain_materialized_view, and there are other validation steps that we are not doing (like validate_timeline_context and validate_system_column_references).

Overall I would prefer to tackle this problem after we unify the optimizer interface between the two methods because then all other differences will become easier to track down.

src/adapter/src/coord/sequencer/inner.rs Show resolved Hide resolved
src/adapter/src/coord/sequencer/inner.rs Show resolved Hide resolved
test/sqllogictest/explain/decorrelated_plan_as_text.slt Outdated Show resolved Hide resolved
test/sqllogictest/explain/raw_plan_as_text.slt Outdated Show resolved Hide resolved
@aalexandrov aalexandrov merged commit 7632dbd into MaterializeInc:main Sep 29, 2023
@aalexandrov aalexandrov deleted the issue_18089 branch September 29, 2023 15:39
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
@aalexandrov aalexandrov mentioned this pull request Oct 5, 2023
5 tasks
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
aalexandrov added a commit to aalexandrov/materialize that referenced this pull request Oct 5, 2023
Update the docs to add the newly supported syntax introduced in
some recently merged PRs (MaterializeInc#21708, MaterializeInc#21973, and MaterializeInc#22021).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ADAPTER Topics related to the ADAPTER layer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants