Dashboard

This is an example dashboard to display the data that is collected by the metrics handler.

This guide assumes you have a Google Cloud Platform project and you are running the metrics handler to collect data in BigQuery.

Running locally

export JOB_HISTORY_TABLE_NAME='your-project-name.metrics_handler_dataset.job_history'
export METRIC_HISTORY_TABLE_NAME='your-project-name.metrics_handler_dataset.metric_history'
# Optional: set this if you want to split tests into multiple tabs or exclude tests.
export TEST_NAME_PREFIXES='prefix1,prefix2'

python3 -m bokeh serve --show dashboard/dashboard.py dashboard/metrics.py dashboard/compare.py

The arguments to bokeh serve --show are paths to the dashboard's Python files. The command above assumes you are running it from the ml-testing-accelerators/ directory.
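Before launching Bokeh, it can help to confirm that the variables above are set. The following is a minimal sketch, not part of the dashboard code: read_dashboard_env is a hypothetical helper, and it assumes TEST_NAME_PREFIXES is a plain comma-separated list as in the example above.

```python
import os

# The two table variables exported in the steps above.
REQUIRED_VARS = ("JOB_HISTORY_TABLE_NAME", "METRIC_HISTORY_TABLE_NAME")


def read_dashboard_env(environ=os.environ):
    """Return (tables, prefixes); raise if a required variable is missing.

    Hypothetical helper for illustration only.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    tables = {name: environ[name] for name in REQUIRED_VARS}
    # TEST_NAME_PREFIXES is optional; an unset or empty value means "no prefixes".
    raw = environ.get("TEST_NAME_PREFIXES", "")
    prefixes = [p.strip() for p in raw.split(",") if p.strip()]
    return tables, prefixes
```

Calling read_dashboard_env() with no arguments reads the current process environment; passing a dict makes it easy to try out values without exporting them.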

Hosting your dashboard

You can host your dashboard using App Engine.

Query caching with Redis is strongly recommended but not required. If you skip the Redis steps, the dashboard still works but logs warnings about failing to connect to the cache.

  1. Set up environment variables:
export REGION=us-west2
export PROJECT_ID=my-project
export INSTANCE_NAME=my-redis-instance
  2. Create the Redis instance (skip this step if you are not using Redis): gcloud redis instances create $INSTANCE_NAME --size=2 --region=$REGION --redis-version=redis_4_0 --project=$PROJECT_ID

  3. Edit app.yaml in 3 ways:

  • Update the Redis info if you are using Redis:
    • REDISHOST = Find this value using: echo $(gcloud redis instances describe $INSTANCE_NAME --region=$REGION --project=$PROJECT_ID --format='value(host)')
    • REDISPORT = Find this value using: echo $(gcloud redis instances describe $INSTANCE_NAME --region=$REGION --project=$PROJECT_ID --format='value(port)')
  • Update JOB_HISTORY_TABLE_NAME and METRIC_HISTORY_TABLE_NAME.
    • You can find these table names in the BigQuery console by clicking your project name in the left sidebar.
  • Change the --allow-websocket-origin argument in entrypoint to the URL of your App Engine project. You can find this URL in your App Engine Dashboard (top right of that UI). If you don’t see the URL there, try deploying first (see the next step); you should receive your URL at that point.
  4. Make sure you are in the directory that contains app.yaml, then run gcloud app deploy
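When filling in --allow-websocket-origin, note that Bokeh's documentation describes this option as taking a HOST[:PORT] value rather than a full URL with a scheme. The sketch below shows one way to derive that value from the URL shown in the App Engine Dashboard. websocket_origin is a hypothetical helper, not something the dashboard provides.

```python
from urllib.parse import urlparse


def websocket_origin(app_url):
    """Return the host[:port] part of an app URL.

    Hypothetical helper for illustration only.
    """
    # urlparse only fills in netloc when a scheme or leading '//' is present.
    candidate = app_url if "://" in app_url else "//" + app_url
    netloc = urlparse(candidate).netloc
    if not netloc:
        raise ValueError("Could not find a host in " + repr(app_url))
    return netloc


print(websocket_origin("https://xl-ml-test.appspot.com/"))
```

The same helper leaves a bare host such as localhost:5006 unchanged, so it is safe to apply to values that are already in the expected form.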

Clean up your hosted dashboard

  1. gcloud redis instances delete $INSTANCE_NAME --region=$REGION --project=$PROJECT_ID

  2. Click "Disable Application" on the App Engine Settings page

Advanced Usage

Multiple dashboard versions

When you run gcloud app deploy, the command defaults to using app.yaml. You can specify a different YAML file, e.g. gcloud app deploy custom.yaml, or deploy multiple versions at once with gcloud app deploy custom.yaml custom2.yaml.

See pytorch-dashboard.yaml for an example of a secondary dashboard. Its allow-websocket-origin argument shows what the URL looks like for secondary dashboards.

Privacy / authentication

You can restrict which tests are shown in the dashboard by using the TEST_NAME_PREFIXES environment variable in the .yaml file. If this variable is set, only tests with the supplied prefixes will be shown and each prefix will appear in its own tab. If this variable is not set, all tests will be shown and will not be divided into tabs.
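To make the prefix behavior concrete, here is a minimal sketch of how test names could be split into tabs. It is illustrative only: group_tests_by_prefix and the "all" tab label are hypothetical, and the dashboard itself may do this filtering differently (for example, in its SQL queries).

```python
def group_tests_by_prefix(test_names, prefixes):
    """Map each prefix to the sorted test names that start with it.

    Hypothetical helper for illustration only.
    """
    if not prefixes:
        # No prefixes configured: show every test, with no tabs.
        return {"all": sorted(test_names)}
    tabs = {prefix: [] for prefix in prefixes}
    for name in sorted(test_names):
        for prefix in prefixes:
            if name.startswith(prefix):
                tabs[prefix].append(name)
                break  # A test appears only under its first matching prefix.
    return tabs


names = ["tf-nightly-mnist", "pt-nightly-resnet", "internal-test"]
print(group_tests_by_prefix(names, ["tf-", "pt-"]))
```

In this sketch, internal-test matches neither prefix and so is excluded, which mirrors how the variable can be used to hide tests.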

The hosted dashboard runs as an App Engine app. You can restrict the URL to specific groups of people by following the Identity-Aware Proxy (IAP) instructions for App Engine. You can configure the IAP rules so that each dashboard version has its own group of restricted viewers. See also the Multiple dashboard versions section above.

Pre-generated URLs for the compare dashboard

The compare.py dashboard lets you specify a list of test names and a list of metric names, and renders one graph and table for each combination of test and metric.

Since it's tedious to type the same list of test and metric names for a commonly-used query, you can generate a URL with the test and metric names encoded in it.

Note that you can use the SQL wildcard % for prefix or suffix matching.
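If you want to preview which names a % pattern would select before building a URL, the following sketch approximates the wildcard locally. It is an assumption-laden illustration: like_matches is a hypothetical helper, it handles only %, and it treats _ as a literal character, whereas SQL LIKE treats _ as a single-character wildcard.

```python
import re


def like_matches(pattern, name):
    """Return True if name matches pattern, where % matches any substring.

    Hypothetical helper for illustration only; only % is treated as a wildcard.
    """
    # Escape the literal pieces and join them with '.*' in place of each '%'.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


print(like_matches("tf-nightly-%", "tf-nightly-mnist"))
print(like_matches("%examples/sec%", "train/examples/sec_avg"))
```

A pattern with no % at all, such as accuracy_final, only matches that exact name in this sketch.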

The URL should be of the form $APP_URL/compare?test_names=$B64_TEST_NAMES&metric_names=$B64_METRIC_NAMES. For example, if your APP_URL is 'https://xl-ml-test.appspot.com', then a simple snippet to generate the URL could be:

import base64

# Base URL of the deployed dashboard.
my_url = 'https://xl-ml-test.appspot.com'
# Comma-separated names are base64-encoded so they can be passed as query parameters.
b64_test_names = base64.b64encode('tf-nightly-%,my-other-test1,my-other-test-2'.encode('utf-8')).decode()
b64_metric_names = base64.b64encode('%examples/sec%,accuracy_final'.encode('utf-8')).decode()
print(f'{my_url}/compare?test_names={b64_test_names}&metric_names={b64_metric_names}')

(Note that the syntax would be slightly different in Python 2, which does not support f-strings.)
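To check that a generated URL encodes what you intended, you can decode it again. The sketch below is a hypothetical helper (decode_compare_url is not part of the dashboard) and assumes the URL has the test_names and metric_names parameters described above.

```python
import base64
from urllib.parse import parse_qs, urlparse


def decode_compare_url(url):
    """Return (test_names, metric_names) lists decoded from a compare URL.

    Hypothetical helper for illustration only.
    """
    query = parse_qs(urlparse(url).query)

    def decode(param):
        # parse_qs turns '+' into a space, so restore it before decoding,
        # since '+' is a valid character in standard base64.
        raw = query[param][0].replace(" ", "+")
        return base64.b64decode(raw).decode("utf-8").split(",")

    return decode("test_names"), decode("metric_names")
```

Passing the URL printed by the snippet above should return the original lists of test names and metric names.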