---
title: Connect BigQuery for Warehouse Native Experiment Analysis
description: Connect a BigQuery service account to enable warehouse native experiment analysis.
private: true
further_reading:
- link: "/experiments/defining_metrics"
  tag: "Documentation"
  text: "Defining metrics in Datadog Experiments"
- link: "https://www.datadoghq.com/blog/experimental-data-datadog/"
  tag: "Blog"
  text: "How to bridge speed and quality in experiments through unified data"
---
## Overview

This guide walks through connecting BigQuery to Datadog to enable warehouse-native experiment analysis in four steps: connecting a Google Cloud Platform (GCP) service account, creating resources in GCP, granting permissions to the service account, and configuring experiment-specific settings in Datadog.

## Step 1: Connect a Google Cloud service account

Datadog connects to BigQuery using a service account created for Datadog. If you have already connected BigQuery to Datadog, you can continue to use that service account for Datadog Experiments. Otherwise, see the [Google Cloud Platform integration page][1] to create a new service account.

Once you have created a service account, continue to the next section.

<div class="alert alert-info">If you're only using the Google Cloud integration for warehouse-native experiment analysis, you can disable collection of other resources.</div>

## Step 2: Create Google Cloud resources

Datadog Experiments requires a Google Cloud Storage bucket to stage experiment exposure records and a BigQuery dataset to cache intermediate experiment results. Follow the steps below to create these resources.

1. In the Google Cloud Console, navigate to **BigQuery**.
1. Click your project, then click **Create Dataset**.
1. Enter a dataset ID (for example, `datadog_experiments_output`), select a data location, and click **Create Dataset**.
1. Follow [Google's documentation][2] to create a new bucket for Datadog to stage experiment exposure records.
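If you prefer the command line, the console steps above can also be sketched with the `bq` and `gcloud` CLIs. The project ID, dataset name, bucket name, and location below are example values, not requirements:

```shell
# Create the BigQuery dataset that caches intermediate experiment results.
# "my-gcp-project" and "datadog_experiments_output" are example names.
bq mk --dataset --location=US my-gcp-project:datadog_experiments_output

# Create the Cloud Storage bucket that stages experiment exposure records.
# "my-datadog-experiments-staging" is an example bucket name.
gcloud storage buckets create gs://my-datadog-experiments-staging --location=US
```

Choose the same location for the dataset and the bucket to avoid cross-region data movement.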

## Step 3: Grant IAM roles to the service account

In addition to the permissions described in the [Google Cloud Platform integration page][1], the Datadog Experiments service account requires the following roles:

1. [BigQuery Job User][4] - allows the service account to run BigQuery jobs.
1. [BigQuery Data Owner][5] - grants the service account full access to the Datadog Experiments output dataset.
1. [Storage Object User][6] - allows the service account to read and write objects in the storage bucket used by Datadog Experiments.
1. [BigQuery Data Viewer][7] - allows the service account to read the tables used in warehouse-native metrics.

To assign these roles at the project level:

1. Navigate to **IAM & Admin** > **IAM** in the Google Cloud Console.
1. Click **Grant Access**.
1. Enter the service account email in the **New principals** field.
1. Add the roles listed above, then click **Save**.
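As a sketch, the same project-level grants can be made with the `gcloud` CLI. The project ID and service account email below are example values; substitute your own:

```shell
# Example project ID and service account email -- replace with your own values.
PROJECT_ID="my-gcp-project"
SA_EMAIL="datadog-experiments@${PROJECT_ID}.iam.gserviceaccount.com"

# Bind each required role to the service account at the project level.
for ROLE in roles/bigquery.jobUser roles/bigquery.dataOwner \
            roles/storage.objectUser roles/bigquery.dataViewer; do
  gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="${ROLE}"
done
```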

To grant read access to specific source tables, follow the steps below:

1. Navigate to **BigQuery** in the Google Cloud Console.
1. Select the dataset containing your source tables.
1. Click **Sharing** > **Permissions**.
1. Click **Add Principal**, enter the service account email, and assign the **BigQuery Data Viewer** role.
1. Repeat for each dataset needed for building experiment metrics.
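Dataset-level access can also be edited from the command line by exporting the dataset's access list, adding an entry, and writing it back. This is a sketch; the project, dataset, and service account names are example values:

```shell
# Export the current access list of an example source dataset.
bq show --format=prettyjson my-gcp-project:source_dataset > dataset.json

# Manually edit dataset.json, adding an entry like the following to the
# "access" array (example service account email shown):
#   {
#     "role": "READER",
#     "userByEmail": "datadog-experiments@my-gcp-project.iam.gserviceaccount.com"
#   }

# Apply the updated access list back to the dataset.
bq update --source dataset.json my-gcp-project:source_dataset
```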

## Step 4: Configure experiment settings

Once your BigQuery service account is connected to Datadog, navigate to the [Experiment Warehouse Connection][8] page and click **Connect a data warehouse** to configure experiment settings.

Select the appropriate service account and project, as well as the dataset and Google Cloud Storage bucket created in step 2. Click **Save** to finish the setup.

{{< img src="/product_analytics/experiment/guide/bigquery_experiment_setup.png" alt="The Edit Data Warehouse modal with BigQuery selected, showing two sections: Select BigQuery Account with fields for GCP Service Account and Project, and Dataset and GCS Bucket with fields for Dataset and GCS Bucket." style="width:90%;" >}}

After you save your warehouse connection, create experiment metrics using your BigQuery data. See [Create Experiment Metrics][9].

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: /integrations/google-cloud-platform/
[2]: https://docs.cloud.google.com/storage/docs/creating-buckets#console
[4]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.jobUser
[5]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.dataOwner
[6]: https://docs.cloud.google.com/iam/docs/roles-permissions/storage#storage.objectUser
[7]: https://docs.cloud.google.com/iam/docs/roles-permissions/bigquery#bigquery.dataViewer
[8]: https://app.datadoghq.com/product-analytics/experiments/settings/warehouse-connections
[9]: /experiments/defining_metrics