Add docs for setting up cloud provider provider section (#1816)
* simplify scrape job examples a bit

* generate docs with examples

* added pre docs

* revert changes

* Apply suggestions from code review

Co-authored-by: Tristan <[email protected]>

* tell user to use previous token

* added sub sections

* more pr corrections

---------

Co-authored-by: Tristan <[email protected]>
thepalbi and tristanburgess authored Sep 30, 2024
1 parent aee0d97 commit 9f711d1
Showing 5 changed files with 242 additions and 160 deletions.
3 changes: 3 additions & 0 deletions GNUmakefile
@@ -67,5 +67,8 @@ golangci-lint:
--workdir "/src" \
golangci/golangci-lint:v1.54 golangci-lint run ./... -v

docs:
go generate ./...

linkcheck:
docker run --rm --entrypoint sh -v "$$PWD:$$PWD" -w "$$PWD" python:3.11-alpine -c "pip3 install linkchecker && linkchecker --config .linkcheckerrc docs"
134 changes: 134 additions & 0 deletions docs/index.md
@@ -226,6 +226,135 @@ resource "grafana_oncall_escalation" "example_notify_step" {
- `tls_key` (String) Client TLS key (file path or literal value) to use to authenticate to the Grafana server. May alternatively be set via the `GRAFANA_TLS_KEY` environment variable.
- `url` (String) The root URL of a Grafana server. May alternatively be set via the `GRAFANA_URL` environment variable.

### Managing Cloud Provider

#### Obtaining Cloud Provider access token

Before using the Terraform provider to manage Grafana Cloud Provider Observability resources, such as AWS CloudWatch scrape jobs, you need to create an access policy token in the Grafana Cloud Portal. This token authenticates the provider to the Grafana Cloud Provider API.
[These docs](https://grafana.com/docs/grafana-cloud/account-management/authentication-and-permissions/access-policies/authorize-services/#create-an-access-policy-for-a-stack) explain how to create
an access policy. The required permissions, or scopes, are `integration-management:read`, `integration-management:write`, and `stacks:read`.

By default, the Access Policies UI does not show these scopes. To find them by name, use the `Add Scope` textbox, as shown in the following image:

<img src="https://grafana.com/media/docs/grafana-cloud/aws/cloud-provider-terraform-access-policy-creation.png" width="700"/>

1. Use the `Add Scope` textbox (1) to search for the permissions you need to add to the access policy.
1. Make sure that you configure the three required scopes. Once done, you'll see the selected scopes as in (2).

After creating the access policy, create a token that will be used to authenticate the provider to the Cloud Provider API. You can do so right after creating the access policy by following
the on-screen instructions, or by following [this guide](https://grafana.com/docs/grafana-cloud/account-management/authentication-and-permissions/access-policies/authorize-services/#create-one-or-more-access-policy-tokens).

#### Obtaining Cloud Provider API hostname

With the token created, you can find the correct Cloud Provider API hostname by running the following script, which requires `curl` and [`jq`](https://jqlang.github.io/jq/) to be installed:

```bash
curl -sH "Authorization: Bearer <Access Token from previous step>" "https://grafana.com/api/instances" | \
jq '[.items[]|{stackName: .slug, clusterName:.clusterSlug, cloudProviderAPIURL: "https://cloud-provider-api-\(.clusterSlug).grafana.net"}]'
```

This script will return a list of all the Grafana Cloud stacks you own, with the Cloud Provider API hostname for each one. Choose the correct hostname for the stack you want to manage.
For example, in the following response, the correct hostname for the `herokublogpost` stack is `https://cloud-provider-api-prod-us-central-0.grafana.net`.

```
[
  {
    "stackName": "herokublogpost",
    "clusterName": "prod-us-central-0",
    "cloudProviderAPIURL": "https://cloud-provider-api-prod-us-central-0.grafana.net"
  }
]
```
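
The URL pattern used by the script above can also be expressed in Terraform. The following is a minimal sketch, assuming you already know the cluster slug of your stack. The variable name `cluster_slug` and the local value `cloud_provider_url` are hypothetical names chosen for illustration and are not defined by the provider:

```terraform
# Hypothetical input: the cluster slug reported for your stack,
# for example "prod-us-central-0" in the response above.
variable "cluster_slug" {
  type        = string
  description = "Cluster slug of the Grafana Cloud stack to manage."
}

locals {
  # Mirrors the URL pattern built by the jq expression in the script above.
  cloud_provider_url = "https://cloud-provider-api-${var.cluster_slug}.grafana.net"
}
```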

#### Configuring Provider

Once you have the token and the Cloud Provider API hostname, you can configure the provider as follows:

```hcl
provider "grafana" {
  // ...
  cloud_provider_url          = "<Cloud Provider API URL from previous step>"
  cloud_provider_access_token = "<Access Token from previous step>"
}
```
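
To keep the token out of version control, the two values can be passed in as input variables rather than written inline. This is a sketch under that assumption; the variable names are illustrative and are not defined by the provider:

```terraform
# Illustrative variable names; choose whatever fits your module.
variable "cloud_provider_url" {
  type        = string
  description = "Cloud Provider API URL for the cluster that hosts the stack."
}

variable "cloud_provider_access_token" {
  type        = string
  sensitive   = true # redacts the value in plan and apply output
  description = "Access policy token with the three required scopes."
}

provider "grafana" {
  cloud_provider_url          = var.cloud_provider_url
  cloud_provider_access_token = var.cloud_provider_access_token
}
```

The values can then be supplied through a `.tfvars` file that is not committed, or through the `TF_VAR_cloud_provider_url` and `TF_VAR_cloud_provider_access_token` environment variables.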

The following examples show how the *Account* and *Scrape Job* resources can be configured:

```terraform
data "grafana_cloud_stack" "test" {
  slug = "gcloudstacktest"
}

data "aws_iam_role" "test" {
  name = "my-role"
}

resource "grafana_cloud_provider_aws_account" "test" {
  stack_id = data.grafana_cloud_stack.test.id
  role_arn = data.aws_iam_role.test.arn
  regions = [
    "us-east-1",
    "us-east-2",
    "us-west-1"
  ]
}
```

```terraform
data "grafana_cloud_stack" "test" {
  slug = "gcloudstacktest"
}

data "aws_iam_role" "test" {
  name = "my-role"
}

resource "grafana_cloud_provider_aws_account" "test" {
  stack_id = data.grafana_cloud_stack.test.id
  role_arn = data.aws_iam_role.test.arn
  regions = [
    "us-east-1",
    "us-east-2",
    "us-west-1"
  ]
}

resource "grafana_cloud_provider_aws_cloudwatch_scrape_job" "test" {
  stack_id                = data.grafana_cloud_stack.test.id
  name                    = "my-cloudwatch-scrape-job"
  aws_account_resource_id = grafana_cloud_provider_aws_account.test.resource_id
  regions                 = grafana_cloud_provider_aws_account.test.regions
  export_tags             = true

  service {
    name = "AWS/EC2"
    metric {
      name       = "CPUUtilization"
      statistics = ["Average"]
    }
    metric {
      name       = "StatusCheckFailed"
      statistics = ["Maximum"]
    }
    scrape_interval_seconds = 300
    resource_discovery_tag_filter {
      key   = "k8s.io/cluster-autoscaler/enabled"
      value = "true"
    }
    tags_to_add_to_metrics = ["eks:cluster-name"]
  }

  custom_namespace {
    name = "CoolApp"
    metric {
      name       = "CoolMetric"
      statistics = ["Maximum", "Sum"]
    }
    scrape_interval_seconds = 300
  }
}
```

## Authentication

One, or many, of the following authentication settings must be set. Each authentication setting allows a subset of resources to be used
@@ -248,3 +377,8 @@ You can use the `grafana_synthetic_monitoring_installation` resource as shown ab

[Grafana OnCall](https://grafana.com/docs/oncall/latest/oncall-api-reference/)
uses API keys to allow access to the API. You can request a new OnCall API key in OnCall -> Settings page.

### `cloud_provider_access_token`

An access policy token created to manage [Grafana Cloud Provider Observability](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/monitor-cloud-provider/).
To create one, follow the instructions in the [managing cloud provider section](#obtaining-cloud-provider-access-token).
101 changes: 21 additions & 80 deletions docs/resources/cloud_provider_aws_cloudwatch_scrape_job.md
@@ -31,97 +31,38 @@ resource "grafana_cloud_provider_aws_account" "test" {
]
}
locals {
services = [
{
name = "AWS/EC2",
metrics = [
{
name = "CPUUtilization",
statistics = [
"Average",
],
},
{
name = "StatusCheckFailed",
statistics = [
"Maximum",
],
},
],
scrape_interval_seconds = 300,
resource_discovery_tag_filters = [
{
key = "k8s.io/cluster-autoscaler/enabled",
value = "true",
}
],
tags_to_add_to_metrics = [
"eks:cluster-name",
]
},
]
custom_namespaces = [
{
name = "CoolApp",
metrics = [
{
name = "CoolMetric",
statistics = [
"Maximum",
"Sum",
]
},
],
scrape_interval_seconds = 300,
},
]
}
resource "grafana_cloud_provider_aws_cloudwatch_scrape_job" "test" {
stack_id = data.grafana_cloud_stack.test.id
name = "my-cloudwatch-scrape-job"
aws_account_resource_id = grafana_cloud_provider_aws_account.test.resource_id
regions = grafana_cloud_provider_aws_account.test.regions
export_tags = true
dynamic "service" {
for_each = local.services
content {
name = service.value.name
dynamic "metric" {
for_each = service.value.metrics
content {
name = metric.value.name
statistics = metric.value.statistics
}
}
scrape_interval_seconds = service.value.scrape_interval_seconds
dynamic "resource_discovery_tag_filter" {
for_each = service.value.resource_discovery_tag_filters
content {
key = resource_discovery_tag_filter.value.key
value = resource_discovery_tag_filter.value.value
}
}
tags_to_add_to_metrics = service.value.tags_to_add_to_metrics
service {
name = "AWS/EC2"
metric {
name = "CPUUtilization"
statistics = ["Average"]
}
metric {
name = "StatusCheckFailed"
statistics = ["Maximum"]
}
scrape_interval_seconds = 300
resource_discovery_tag_filter {
key = "k8s.io/cluster-autoscaler/enabled"
value = "true"
}
tags_to_add_to_metrics = ["eks:cluster-name"]
}
dynamic "custom_namespace" {
for_each = local.custom_namespaces
content {
name = custom_namespace.value.name
dynamic "metric" {
for_each = custom_namespace.value.metrics
content {
name = metric.value.name
statistics = metric.value.statistics
}
}
scrape_interval_seconds = custom_namespace.value.scrape_interval_seconds
custom_namespace {
name = "CoolApp"
metric {
name = "CoolMetric"
statistics = ["Maximum", "Sum"]
}
scrape_interval_seconds = 300
}
}
```
@@ -16,96 +16,37 @@ resource "grafana_cloud_provider_aws_account" "test" {
]
}

locals {
services = [
{
name = "AWS/EC2",
metrics = [
{
name = "CPUUtilization",
statistics = [
"Average",
],
},
{
name = "StatusCheckFailed",
statistics = [
"Maximum",
],
},
],
scrape_interval_seconds = 300,
resource_discovery_tag_filters = [
{
key = "k8s.io/cluster-autoscaler/enabled",
value = "true",
}
],
tags_to_add_to_metrics = [
"eks:cluster-name",
]
},
]
custom_namespaces = [
{
name = "CoolApp",
metrics = [
{
name = "CoolMetric",
statistics = [
"Maximum",
"Sum",
]
},
],
scrape_interval_seconds = 300,
},
]
}

resource "grafana_cloud_provider_aws_cloudwatch_scrape_job" "test" {
stack_id = data.grafana_cloud_stack.test.id
name = "my-cloudwatch-scrape-job"
aws_account_resource_id = grafana_cloud_provider_aws_account.test.resource_id
regions = grafana_cloud_provider_aws_account.test.regions
export_tags = true

dynamic "service" {
for_each = local.services
content {
name = service.value.name
dynamic "metric" {
for_each = service.value.metrics
content {
name = metric.value.name
statistics = metric.value.statistics
}
}
scrape_interval_seconds = service.value.scrape_interval_seconds
dynamic "resource_discovery_tag_filter" {
for_each = service.value.resource_discovery_tag_filters
content {
key = resource_discovery_tag_filter.value.key
value = resource_discovery_tag_filter.value.value
}

}
tags_to_add_to_metrics = service.value.tags_to_add_to_metrics
service {
name = "AWS/EC2"
metric {
name = "CPUUtilization"
statistics = ["Average"]
}
metric {
name = "StatusCheckFailed"
statistics = ["Maximum"]
}
scrape_interval_seconds = 300
resource_discovery_tag_filter {
key = "k8s.io/cluster-autoscaler/enabled"
value = "true"
}
tags_to_add_to_metrics = ["eks:cluster-name"]
}

dynamic "custom_namespace" {
for_each = local.custom_namespaces
content {
name = custom_namespace.value.name
dynamic "metric" {
for_each = custom_namespace.value.metrics
content {
name = metric.value.name
statistics = metric.value.statistics
}
}
scrape_interval_seconds = custom_namespace.value.scrape_interval_seconds
custom_namespace {
name = "CoolApp"
metric {
name = "CoolMetric"
statistics = ["Maximum", "Sum"]
}
scrape_interval_seconds = 300
}
}