Add progress bars for main phases #109

Merged
merged 3 commits into from
Jun 18, 2024

Conversation

daverigby
Collaborator

@daverigby daverigby commented Jun 14, 2024

Add progress bars for each of the main phases of an experiment:
Setup, Populate and Run. For Populate and Run we require the total
number of records / queries so that the bar can show progress towards
completion.

For the Run phase we include the current latency and recall(*) values.
These require additional metrics of how many records have been
upserted so far (note that we generally upsert in batches, so the
existing number of Population requests is not sufficient).
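
To illustrate why a request count alone is not enough when upserting in batches, here is a minimal sketch in plain Python. The names `PopulateProgress`, `on_batch` and `render_bar` are hypothetical and are not taken from the VSB code; the point is only that the bar's numerator has to count records, not requests.

```python
from dataclasses import dataclass


@dataclass
class PopulateProgress:
    """Hypothetical tracker: counts records and requests separately."""

    total_records: int
    records_done: int = 0
    requests_done: int = 0

    def on_batch(self, batch_size: int) -> None:
        # One upsert request may carry many records.
        self.requests_done += 1
        self.records_done += batch_size

    @property
    def fraction(self) -> float:
        # Progress is measured in records, the same unit as the total.
        if self.total_records == 0:
            return 1.0
        return min(1.0, self.records_done / self.total_records)


def render_bar(fraction: float, width: int = 20) -> str:
    """Render a simple text progress bar for a fraction in [0, 1]."""
    filled = int(fraction * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {fraction:.0%}"


progress = PopulateProgress(total_records=1000)
for _ in range(3):
    progress.on_batch(200)
print(render_bar(progress.fraction), f"({progress.requests_done} requests)")
```

After three batches of 200 records, the request counter reads 3 while 600 of the 1000 records are done, so only the record counter gives a meaningful fraction.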

(*) For recall we cannot calculate the current metric (last 10s), as
we only have a single histogram accumulating the results; instead
this shows the overall recall so far. An improvement has been
raised to address this (#108).
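
The difference between overall and recent recall can be sketched as follows. This is an illustration only: `RecallAccumulator` and `recall_since` are hypothetical names, and the snapshot approach shown is one possible way to get a windowed value, not necessarily what #108 proposes.

```python
class RecallAccumulator:
    """Hypothetical single accumulator, standing in for one histogram."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def add(self, recall: float) -> None:
        self.total += recall
        self.count += 1

    def overall(self) -> float:
        # All that a single accumulator can report: the mean since the start.
        return self.total / self.count if self.count else 0.0

    def snapshot(self) -> tuple[float, int]:
        return (self.total, self.count)


def recall_since(acc: RecallAccumulator, earlier: tuple[float, int]) -> float:
    """Mean recall of the samples added after `earlier` was taken."""
    d_total = acc.total - earlier[0]
    d_count = acc.count - earlier[1]
    return d_total / d_count if d_count else 0.0


acc = RecallAccumulator()
acc.add(1.0)
acc.add(1.0)
earlier = acc.snapshot()
acc.add(0.5)
acc.add(0.5)
print(acc.overall(), recall_since(acc, earlier))
```

Here the overall value is 0.75 even though the most recent samples average 0.5, which is the gap the footnote describes.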

Fixes #44.

Use vsb.logger instead of the global 'logging' methods, as we only
show messages from the vsb logger on stdout (everything still goes to
the vsb.log file).
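
A minimal sketch of this kind of split using the standard `logging` module is shown below. It assumes the project logger is a named logger called `"vsb"`; the `configure_logging` helper, its parameters and the log path are hypothetical and are not VSB's actual setup code.

```python
import logging
import sys


def configure_logging(log_path: str, stream=None) -> logging.Logger:
    """Send everything to a file, but only the named logger to the console."""
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    # File handler on the root logger: receives records from every logger.
    root.addHandler(logging.FileHandler(log_path))

    # Console handler only on the project logger (name "vsb" is an assumption).
    logger = logging.getLogger("vsb")
    logger.addHandler(logging.StreamHandler(stream or sys.stdout))
    return logger
```

With this arrangement a call such as `logging.info(...)` reaches only the file, while `logger.info(...)` reaches the console and, through propagation to the root logger, the file as well.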
Add a new VectorWorkload.request_count property to all workloads,
which specifies how many requests are in the query dataset. This will
be used to show a progress bar for the Run phase - the MasterRunner
needs to know the total number of requests for the progress bar's
denominator.
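
As a rough illustration of the shape of such a property, the sketch below defines an abstract `request_count` and one concrete workload. Only the name `VectorWorkload.request_count` comes from the commit message; the class bodies, `InMemoryWorkload` and `run_fraction` are hypothetical, and the real `VectorWorkload` has other members not shown here.

```python
from abc import ABC, abstractmethod


class VectorWorkload(ABC):
    """Simplified stand-in for the workload base class."""

    @property
    @abstractmethod
    def request_count(self) -> int:
        """Total number of requests in the query dataset."""


class InMemoryWorkload(VectorWorkload):
    """Hypothetical workload holding its queries in a list."""

    def __init__(self, queries: list[list[float]]) -> None:
        self._queries = list(queries)

    @property
    def request_count(self) -> int:
        return len(self._queries)


def run_fraction(completed: int, workload: VectorWorkload) -> float:
    """Fraction of the Run phase done, using request_count as the denominator."""
    total = workload.request_count
    return completed / total if total else 1.0


workload = InMemoryWorkload([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
print(run_fraction(1, workload))
```

Making the property abstract means every workload has to supply a total up front, so the runner never has to guess the denominator.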
@daverigby daverigby force-pushed the progressbars branch 2 times, most recently from 7490de6 to 5691131 Compare June 17, 2024 14:11
Add progress bars for each of the main phases of an experiment:
Setup, Populate and Run. For Populate and Run we require the total
number of records / queries so that the bar can show progress towards
completion.

For the Populate phase we include the current rate of upsert. For the
Run phase we include the current latency and recall(*) values. These
require additional metrics of how many records have been upserted so
far (note that we generally upsert in batches, so the existing number
of Population requests is not sufficient).

(*) For recall we cannot calculate the current metric (last 10s), as
    we only have a single histogram accumulating the results; instead
    this shows the overall recall so far. An improvement has been
    raised to address this (#108).
@daverigby daverigby merged commit cf705b9 into main Jun 18, 2024
4 checks passed
@daverigby daverigby deleted the progressbars branch June 18, 2024 08:35
Development

Successfully merging this pull request may close these issues.

Re-add progress bar for downloading