
Add a prometheus label mapping component #2025

Open · wants to merge 1 commit into base: main

Conversation

@vaxvms commented Nov 4, 2024

PR Description

This PR adds a Prometheus component that creates a label from a source label's value using a static mapping table.

Which issue(s) this PR fixes

For a large mapping table, using regexes with prometheus.relabel is inefficient.
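
To make the difference concrete: a mapping that needs one rule block per value in prometheus.relabel becomes a single map entry in the proposed component (syntax taken from the examples later in this thread):

// prometheus.relabel: one rule block per mapped value
rule {
  source_labels = ["service_id"]
  regex         = "^000001$"
  target_label  = "tenant_id"
  replacement   = "tenant_42"
}

// proposed prometheus.mapping: one entry in a static table
mapping = {
  "000001" = "tenant_42",
}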

Notes to the Reviewer

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated

@vaxvms vaxvms requested review from clayton-cornell and a team as code owners November 4, 2024 14:51
@CLAassistant commented Nov 4, 2024

CLA assistant check
All committers have signed the CLA.

@clayton-cornell (Contributor) left a comment

Some initial doc review comments

@clayton-cornell clayton-cornell requested a review from a team November 5, 2024 18:27
@clayton-cornell added the type/docs label (Docs Squad label across all Grafana Labs repos) on Nov 5, 2024
@thampiotr (Contributor) commented

Thanks for contributing.

> For a large mapping table, using regexes with prometheus.relabel is inefficient.

Could you share the use case and the performance differences? In many cases users can use discovery.relabel to add the team=X labels, and then the overhead is much smaller because it's targets, not samples, that are relabeled.

If the reason to add this is that prometheus.relabel is not efficient, we will need more data on how inefficient it is and whether it could be optimised to work faster. Any new solution we consider will require benchmarks proving that it's more efficient. We also generally want to avoid having many ways of doing the same thing, though there can be exceptions.
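
For illustration, a minimal sketch of the discovery.relabel approach mentioned above; the discovery source and the label values are placeholders, not part of this PR:

discovery.relabel "add_team" {
  targets = discovery.kubernetes.pods.targets

  // Each target is relabeled once, before scraping,
  // instead of relabeling every sample it produces.
  rule {
    source_labels = ["service_id"]
    regex         = "^000001$"
    target_label  = "team"
    replacement   = "team_a"
  }
}

prometheus.scrape "default" {
  targets    = discovery.relabel.add_team.output
  forward_to = [prometheus.remote_write.default.receiver]
}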

@vaxvms (Author) commented Nov 6, 2024

I have to map a label holding the service ID to a label holding the tenant ID.

My test case maps X distinct service IDs to Y tenants, with 100 different service IDs mapped to each tenant. Obviously, Y <= X.

Using prometheus.relabel with X set to 20k (and Y to 200), I'm consuming 20 vCPUs and seeing a 49% failure rate.

Using prometheus.mapping with X set to 100k (and Y to 1k), I'm consuming 2 vCPUs.

Since my worst case can be X == Y, I haven't tried to collapse the regexes, so I have one rule block per service ID.

@thampiotr (Contributor) commented

Thanks for sharing some performance data. Have you tried to use discovery.relabel instead?

The issue with prometheus.relabel is that you are relabelling every single sample. With discovery.relabel you can relabel the targets before they are scraped.

So, for example, in a cluster of 1k targets and each exposing 1k metrics, you would have 1 million relabels with prometheus.relabel as opposed to 1k relabels with discovery.relabel.

If we find that discovery.relabel cannot be used for some reason, I would like to understand better why. If we need to optimise prometheus.relabel, I would like to explore that option too before committing to a new feature that does a very similar job. We may also need to go through the proposal process to let the team and community comment on this.

@vaxvms (Author) commented Nov 20, 2024

Thanks for your answer.

I'm not using Alloy to scrape targets; I'm receiving data through remote_write from multiple data producers.
The data producers only know the service_id, not the tenant_id, so they cannot apply the correct label at scrape time.

I don't see how we could optimize prometheus.relabel to be as performant as a key lookup: every sample has to be tested against up to X compiled regexes, while a map lookup is a single hash operation. We have made an attempt to share the cache between instances (#1692), which aims to lower memory consumption.

Here is an excerpt of my test configuration using prometheus.relabel:

prometheus.receive_http "ingestion_head" {
    http {
        listen_address = "0.0.0.0"
        listen_port    = 9000
    }
    forward_to = [prometheus.relabel.allow_list.receiver]
}

prometheus.relabel "allow_list" {
    forward_to = [prometheus.remote_write.ingestion_tail.receiver]
    rule {
        action        = "keep"
        regex         = "(.+)"
        source_labels = ["service_id"]
    }


    // One rule block per service_id
    rule {
        source_labels = ["service_id"]
        regex         = "^000001$"
        target_label  = "tenant_id"
        replacement   = "tenant_42"
    }
    
    rule {
        source_labels = ["service_id"]
        regex         = "^000002$"
        target_label  = "tenant_id"
        replacement   = "tenant_190"
    }
    
    // ...
    
    rule {
        source_labels = ["service_id"]
        regex         = "^12345678$"
        target_label  = "tenant_id"
        replacement   = "tenant_8876"
    }
}

prometheus.remote_write "ingestion_tail" {
    endpoint {
        url = "http://mimir/"
    }
}

prometheus.relabel can be fairly heavy on resources, especially when we do a 1:1 match, hence the proposal for a mapping component. Here is an example of a working proof of concept based on the code in this PR:

prometheus.receive_http "ingestion_head" {
  http {
    listen_address = "0.0.0.0"
    listen_port    = 9000
  }
  forward_to = [prometheus.relabel.allow_list.receiver]
}

prometheus.relabel "allow_list" {
  forward_to = [prometheus.mapping.tenants_mapping.receiver]

  // First, drop everything that doesn't have a service_id
  rule {
    action        = "keep"
    regex         = "(.+)"
    source_labels = ["service_id"]
  }
}

prometheus.mapping "tenants_mapping" {
  forward_to = [prometheus.remote_write.ingestion_tail.receiver]

  src_label_name = "service_id"
  output_label_name = "tenant_id"

  mapping = {
    "000001"   = "tenant_42",
    "000002"   = "tenant_190",
    // ...
    "12345678" = "tenant_8876",
  }
}

prometheus.remote_write "ingestion_tail" {
  endpoint {
    url = "http://mimir/"
  }
}

@wilfriedroset commented

What we forgot to make explicit with @vaxvms is that our initial design would benefit from #521: it would remove the need for cortex-tenant.

@clayton-cornell (Contributor) commented

@vaxvms Did the doc topic at components/prometheus/prometheus.relabel accidentally fall off this PR in one of the force-push updates? Or was it removed intentionally? I don't see it in the file list anymore.

@vaxvms (Author) commented Nov 26, 2024

> @vaxvms Did the doc topic at components/prometheus/prometheus.relabel accidentally fall off this PR in one of the force-push updates? Or was it removed intentionally? I don't see it in the file list anymore.

Are you talking about components/prometheus/prometheus.mapping? I accidentally removed it; it's back again.

I've updated the way the component works: the target_label isn't fixed anymore, and each value of the source label can be mapped to several labels. A sketch of what that could look like follows.
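
A sketch of the updated interface, assuming each mapping value is itself a map of label names to values (the attribute names here are illustrative, not the final syntax):

prometheus.mapping "tenants_mapping" {
  forward_to = [prometheus.remote_write.ingestion_tail.receiver]

  source_label = "service_id"

  // Hypothetical shape: each source value maps to a set of labels.
  mapping = {
    "000001" = {
      "tenant_id" = "tenant_42",
      "team"      = "team_a",
    },
    "000002" = {
      "tenant_id" = "tenant_190",
    },
  }
}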

@vaxvms (Author) commented Nov 28, 2024

Thanks for the review, @clayton-cornell. Most of the phrasing came from prometheus.relabel.md.
