benchmark: Update startup benchmarks #29767

jkosh44 · 2024-09-27T18:16:52Z

This commit updates the StartupLoaded feature benchmarks to create
a greater number of complex objects. We've noticed a significant amount
of time during startup is spent in optimization, so we want to make
sure that this benchmark captures that.

Works towards resolving #MaterializeInc/database-issues/8384

Motivation

Improves test

Checklist

This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

This commit updates the `StartupLoaded` feature benchmarks to create a greater number of complex objects. We've noticed a significant amount of time during startup is spent in optimization, so we want to make sure that this benchmark captures that. Works towards resolving #MaterializeInc/database-issues/8384

jkosh44 · 2024-10-04T15:40:24Z

misc/python/materialize/feature_benchmark/scenarios/benchmark_main.py

+
+        create_tpch_tables = "\n".join(
+            f"""
+> {newline.sub(" ", comment.sub("", ddl))}


It took me a long time to come up with the right regex magic for TD to accept these queries. If there's a better way then please let me know.

Generally testdrive is not great for programmatically generating, I might have just pasted the queries in directly (if they are not too huge).

It's roughly 1000 lines of SQL

Ouch, then makes sense to do it this way.

def-

I think this makes the scenario much more interesting.

You could argue that we might want to keep the old version and add this as a new one, but I'm ok with not having too many scenarios, so fine by me.

def- · 2024-10-04T16:13:47Z

misc/python/materialize/feature_benchmark/scenarios/benchmark_main.py

@@ -1833,7 +1835,7 @@ def benchmark(self) -> BenchmarkingSequence:
 class StartupLoaded(Scenario):


You should bump the version of this scenario because it's no longer comparable to old results:

def version(self) -> ScenarioVersion: return ScenarioVersion.create(1, 1, 0)

@nrainer-materialize I think we should just make it a class variable instead of a method, but can be done independent of this PR, unless you had a good reason for this design.

I ended up splittin out the TPC-H stuff into a new benchmark, so I think I don't need to update the version anymore, right?

@jkosh44: That's right.

def-

Fine now, no version upgrade needed

def- · 2024-10-04T18:24:17Z

misc/python/materialize/feature_benchmark/scenarios/benchmark_main.py

+
+        create_tpch_tables = "\n".join(
+            f"""
+> {newline.sub(" ", comment.sub("", ddl))}


Ouch, then makes sense to do it this way.

jkosh44 · 2024-10-07T13:26:07Z

@def- FYI, according to the Nightly run of this PR, there's a regression in the StartupTpch workload, https://buildkite.com/materialize/nightly/builds/9824#0192589f-e35b-4f92-b798-63fc890e702d. I don't think that makes much sense because this PR is what added the StartupTpch workload. There may be something weird going on there.

def- · 2024-10-07T13:33:26Z

It does make sense to me. The way that we run feature-benchmark to check for regressions uses the new state of test code to compare the old Materialize product against the new Materialize product. But we never use the old state of test code.

jkosh44 · 2024-10-07T13:42:12Z

So basically, when comparing 6bfdb4e against this branch, there actually is a regression in startup times (accoring to StartupTpch)?

def- · 2024-10-07T13:48:12Z

Yes, assuming the scenario works reliably. Since there were no product changes, maybe the scenario is too flaky to be useful.

jkosh44 · 2024-10-07T13:55:53Z

Since there were no product changes

How did you reach that conclusion? Was 6bfdb4e the commit directly before this one?

def- · 2024-10-07T14:06:11Z

Yes, in PRs we always compare against the last state that was in main.

jkosh44 · 2024-10-07T14:09:55Z

Ah ok, I'm curious to see how it looks after a couple of days and if it's always flaky then I'm OK removing it.

nrainer-materialize · 2024-10-07T14:14:47Z

The build's error annotations show the comparison target.

For example:

New regression against v0.120.0-dev.0--main.g6bfdb4e0f9ab5476ca638fcf37827588cc40dbe9

jkosh44 force-pushed the startup-benchmark branch from 74050fa to 028f3a4 Compare October 3, 2024 15:37

jkosh44 changed the title ~~WIP improve startup benchmark~~ benchmark: Update startup benchmarks Oct 3, 2024

jkosh44 force-pushed the startup-benchmark branch 7 times, most recently from 6f99f25 to 9d5ad9d Compare October 4, 2024 15:36

jkosh44 force-pushed the startup-benchmark branch from 9d5ad9d to c1336a8 Compare October 4, 2024 15:38

jkosh44 commented Oct 4, 2024

View reviewed changes

jkosh44 requested a review from a team October 4, 2024 15:40

jkosh44 marked this pull request as ready for review October 4, 2024 15:41

def- requested changes Oct 4, 2024

View reviewed changes

jkosh44 added 2 commits October 4, 2024 13:37

Fixup

50ff5ec

Fixup

f9831d7

def- approved these changes Oct 4, 2024

View reviewed changes

jkosh44 merged commit 010bfbf into MaterializeInc:main Oct 4, 2024
15 of 18 checks passed

jkosh44 deleted the startup-benchmark branch October 4, 2024 20:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

benchmark: Update startup benchmarks #29767

benchmark: Update startup benchmarks #29767

jkosh44 commented Sep 27, 2024 •

edited

Loading

jkosh44 Oct 4, 2024

def- Oct 4, 2024

jkosh44 Oct 4, 2024

def- Oct 4, 2024

def- left a comment

def- Oct 4, 2024

jkosh44 Oct 4, 2024

nrainer-materialize Oct 7, 2024

def- left a comment

def- Oct 4, 2024

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024 •

edited

Loading

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024

jkosh44 commented Oct 7, 2024

nrainer-materialize commented Oct 7, 2024

		@@ -1833,7 +1835,7 @@ def benchmark(self) -> BenchmarkingSequence:
		class StartupLoaded(Scenario):

benchmark: Update startup benchmarks #29767

benchmark: Update startup benchmarks #29767

Conversation

jkosh44 commented Sep 27, 2024 • edited Loading

Motivation

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

def- left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

def- left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024 • edited Loading

jkosh44 commented Oct 7, 2024

def- commented Oct 7, 2024

jkosh44 commented Oct 7, 2024

nrainer-materialize commented Oct 7, 2024

jkosh44 commented Sep 27, 2024 •

edited

Loading

def- commented Oct 7, 2024 •

edited

Loading