-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
benchmark: Update startup benchmarks #29767
Conversation
74050fa
to
028f3a4
Compare
6f99f25
to
9d5ad9d
Compare
This commit updates the `StartupLoaded` feature benchmarks to create a greater number of complex objects. We've noticed a significant amount of time during startup is spent in optimization, so we want to make sure that this benchmark captures that. Works towards resolving #MaterializeInc/database-issues/8384
9d5ad9d
to
c1336a8
Compare
|
||
create_tpch_tables = "\n".join( | ||
f""" | ||
> {newline.sub(" ", comment.sub("", ddl))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a long time to come up with the right regex magic for TD to accept these queries. If there's a better way then please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally testdrive is not great for programmatically generating, I might have just pasted the queries in directly (if they are not too huge).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's roughly 1000 lines of SQL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch, then makes sense to do it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes the scenario much more interesting.
You could argue that we might want to keep the old version and add this as a new one, but I'm ok with not having too many scenarios, so fine by me.
@@ -1833,7 +1835,7 @@ def benchmark(self) -> BenchmarkingSequence: | |||
class StartupLoaded(Scenario): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should bump the version of this scenario because it's no longer comparable to old results:
def version(self) -> ScenarioVersion:
return ScenarioVersion.create(1, 1, 0)
@nrainer-materialize I think we should just make it a class variable instead of a method, but can be done independent of this PR, unless you had a good reason for this design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up splittin out the TPC-H stuff into a new benchmark, so I think I don't need to update the version anymore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkosh44: That's right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine now, no version upgrade needed
|
||
create_tpch_tables = "\n".join( | ||
f""" | ||
> {newline.sub(" ", comment.sub("", ddl))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch, then makes sense to do it this way.
@def- FYI, according to the Nightly run of this PR, there's a regression in the StartupTpch workload, https://buildkite.com/materialize/nightly/builds/9824#0192589f-e35b-4f92-b798-63fc890e702d. I don't think that makes much sense because this PR is what added the StartupTpch workload. There may be something weird going on there. |
It does make sense to me. The way that we run feature-benchmark to check for regressions uses the new state of test code to compare the old Materialize product against the new Materialize product. But we never use the old state of test code. |
So basically, when comparing 6bfdb4e against this branch, there actually is a regression in startup times (accoring to |
Yes, assuming the scenario works reliably. Since there were no product changes, maybe the scenario is too flaky to be useful. |
How did you reach that conclusion? Was 6bfdb4e the commit directly before this one? |
Yes, in PRs we always compare against the last state that was in main. |
Ah ok, I'm curious to see how it looks after a couple of days and if it's always flaky then I'm OK removing it. |
The build's error annotations show the comparison target. For example:
|
This commit updates the
StartupLoaded
feature benchmarks to createa greater number of complex objects. We've noticed a significant amount
of time during startup is spent in optimization, so we want to make
sure that this benchmark captures that.
Works towards resolving #MaterializeInc/database-issues/8384
Motivation
Improves test
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.