---
title: Connect BigQuery for Warehouse-Native Experiment Analysis
description: Connect a BigQuery service account to enable warehouse-native experiment analysis.
private: true
further_reading:
- link: "/experiments/defining_metrics"
tag: "Documentation"
text: "Defining metrics in Datadog Experiments"
- link: "https://www.datadoghq.com/blog/experimental-data-datadog/"
tag: "Blog"
text: "How to bridge speed and quality in experiments through unified data"
---

## Overview

This guide walks through connecting BigQuery to Datadog to enable warehouse-native experiment analysis. There are four steps: connect a Google Cloud service account, create resources in Google Cloud, grant permissions to the service account, and configure experiment settings in Datadog.

## Step 1: Connect a Google Cloud service account

Datadog connects to BigQuery using a service account created for Datadog. If you have already connected BigQuery to Datadog, you can continue to use that service account for Datadog Experiments. Otherwise, see the [Google Cloud integration page][1] to create a service account.

After you create a service account, continue to the next section.

<div class="alert alert-info">If you use the Google Cloud integration only for warehouse-native experiment analysis, you can disable collection of other resources.</div>

## Step 2: Create Google Cloud resources

Datadog Experiments requires a Google Cloud Storage bucket to stage experiment exposure records and a BigQuery dataset to cache intermediate experiment results. Follow the steps below to create these resources.

1. In the Google Cloud Console, navigate to **BigQuery**.
1. Click your project, then click **Create Dataset**.
1. Enter a dataset ID (for example, `datadog_experiments_output`), select a data location, and click **Create Dataset**.

1. Follow [Google's documentation][2] to create a bucket for Datadog to stage experiment exposure records.
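If you prefer the command line, the same resources can be created with the `bq` and `gcloud storage` CLIs. This is a sketch only; the project ID (`my-project`), dataset ID, bucket name, and location below are placeholder values, so substitute your own:

```shell
# Create the BigQuery dataset that caches intermediate experiment results.
# The location should match where your source data lives.
bq --location=US mk --dataset my-project:datadog_experiments_output

# Create the Cloud Storage bucket that stages experiment exposure records.
# Bucket names are globally unique, so choose your own.
gcloud storage buckets create gs://my-datadog-experiments-staging --location=US
```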

## Step 3: Grant IAM roles to the service account

In addition to the permissions described in the [Google Cloud integration page][1], the Datadog Experiments service account requires the following roles:

1. [BigQuery Job User][4]: Allows the service account to run BigQuery jobs.
1. [BigQuery Data Owner][5]: Grants the service account full access to the Datadog Experiments output dataset.
1. [Storage Object User][6]: Allows the service account to read and write objects in the storage bucket used by Datadog Experiments.
1. [BigQuery Data Viewer][7]: Allows the service account to read the source tables used in warehouse-native metrics.

To assign these roles at the project level:

1. Navigate to **IAM & Admin** > **IAM** in the Google Cloud Console.
1. Click **Grant Access**.
1. Enter the service account email in the **New principals** field.
1. Add the roles listed above, then click **Save**.
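If you manage IAM from the command line, the same project-level grants can be made with `gcloud`. A sketch, where the project ID and service account email are placeholder values:

```shell
# Grant each role required by Datadog Experiments at the project level.
# Replace my-project and the service account email with your own values.
for role in roles/bigquery.jobUser roles/bigquery.dataOwner \
            roles/storage.objectUser roles/bigquery.dataViewer; do
  gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:datadog@my-project.iam.gserviceaccount.com" \
    --role="$role"
done
```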

To grant read access to specific source tables, follow the steps below:

1. Navigate to **BigQuery** in the Google Cloud Console.
1. Select the dataset containing your source tables.
1. Click **Sharing** > **Permissions**.
1. Click **Add Principal**, enter the service account email, and assign the **BigQuery Data Viewer** role.
1. Repeat for each dataset needed to build experiment metrics.
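Dataset-level access can also be granted with `bq add-iam-policy-binding`. A sketch, where the source dataset and service account email are placeholder values:

```shell
# Grant read-only access to one source dataset.
# Repeat for each dataset used in experiment metrics.
bq add-iam-policy-binding \
  --member="serviceAccount:datadog@my-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer" \
  my-project:analytics_source
```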

## Step 4: Configure experiment settings

After your BigQuery service account is connected to Datadog, navigate to the [Experiment Warehouse Connection][8] page. Click **Connect a data warehouse** to configure experiment settings.

Select the appropriate service account and project, as well as the dataset and Google Cloud Storage bucket created in step 2. Click **Save** to finish the setup.

{{< img src="/product_analytics/experiment/guide/bigquery_experiment_setup.png" alt="The Edit Data Warehouse modal with BigQuery selected, showing two sections: Select BigQuery Account with fields for GCP Service Account and Project, and Dataset and GCS Bucket with fields for Dataset and GCS Bucket." style="width:90%;" >}}

After you save your warehouse connection, create experiment metrics using your BigQuery data. See [Create Experiment Metrics][9].

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: /integrations/google-cloud-platform/
[2]: https://docs.cloud.google.com/storage/docs/creating-buckets#console
[4]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.jobUser
[5]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.dataOwner
[6]: https://docs.cloud.google.com/iam/docs/roles-permissions/storage#storage.objectUser
[7]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.dataViewer
[8]: https://app.datadoghq.com/product-analytics/experiments/settings/warehouse-connections
[9]: /experiments/defining_metrics
