Commit

Update docs/sources/flow/reference/components/otelcol.processor.probabilistic_sampler.md

Co-authored-by: Clayton Cornell <[email protected]>
mar4uk and clayton-cornell committed Sep 14, 2023
1 parent 732e412 commit 1f2f274
Showing 1 changed file with 15 additions and 13 deletions.
@@ -7,12 +7,14 @@ title: otelcol.processor.probabilistic_sampler

# otelcol.processor.probabilistic_sampler

- `otelcol.processor.probabilistic_sampler` accepts logs and traces data from other otelcol components and apply probabilistic sampling based on configuration options.
+ `otelcol.processor.probabilistic_sampler` accepts logs and traces data from other otelcol components and applies probabilistic sampling based on configuration options.


- > **Note**: `otelcol.processor.probabilistic_sampler` is a wrapper over the upstream
- > OpenTelemetry Collector Contrib `probabilistic_sampler` processor. Bug reports or feature
- > requests will be redirected to the upstream repository, if necessary.
+ {{% admonition type="note" %}}
+ `otelcol.processor.probabilistic_sampler` is a wrapper over the upstream
+ OpenTelemetry Collector Contrib `probabilistic_sampler` processor. If necessary,
+ bug reports or feature requests will be redirected to the upstream repository.
+ {{% /admonition %}}

You can specify multiple `otelcol.processor.probabilistic_sampler` components by giving them
different labels.
@@ -35,28 +37,28 @@ otelcol.processor.probabilistic_sampler "LABEL" {
Name | Type | Description | Default | Required
---- |-----------|----------------------------------------------------------------------------------------------------------------------|-------------| --------
`hash_seed` | `uint32` | An integer used to compute the hash algorithm. | `0` | no
- `sampling_percentage` | `float32` | Percentage at which traces or logs are sampled. | `0` | no
+ `sampling_percentage` | `float32` | Percentage of traces or logs sampled. | `0` | no
`attribute_source` | `string` | Defines where to look for the attribute in `from_attribute`. | `"traceID"` | no
`from_attribute` | `string` | The name of a log record attribute used for sampling purposes. | `""` | no
`sampling_priority` | `string` | The name of a log record attribute used to set a different sampling priority from the `sampling_percentage` setting. | `""` | no

- `hash_seed` determines an integer to compute the hash algorithm. Could be used for both: traces and logs.
- When using for logs it computes the hash of a log record.
- In order for hashing to work, all collectors for a given tier (e.g. behind the same load balancer) must have the same `hash_seed`.
+ `hash_seed` determines an integer to compute the hash algorithm. This argument could be used for both traces and logs.
+ When used for logs, it computes the hash of a log record.
+ For hashing to work, all collectors for a given tier, for example, behind the same load balancer, must have the same `hash_seed`.
It is also possible to leverage a different `hash_seed` at different collector tiers to support additional sampling requirements.

- `sampling_percentage` determines the percentage at which traces or logs are sampled; >= 100 samples all
+ `sampling_percentage` determines the percentage at which traces or logs are sampled. All traces or logs are sampled if you set this argument to a value greater than or equal to 100.

`attribute_source` (logs only) determines where to look for the attribute in `from_attribute`. The allowed values are `traceID` or `record`.

- `from_attribute` (logs only) determines the name of a log record attribute used for sampling purposes, such as a unique log record ID. The value of the attribute is only used if the trace ID is absent or if `attribute_source` is set to `record`
+ `from_attribute` (logs only) determines the name of a log record attribute used for sampling purposes, such as a unique log record ID. The value of the attribute is only used if the trace ID is absent or if `attribute_source` is set to `record`.

- `sampling_priority` (logs only) determines the name of a log record attribute used to set a different sampling priority from the `sampling_percentage` setting. 0 means to never sample the log record, and >= 100 means to always sample the log record
+ `sampling_priority` (logs only) determines the name of a log record attribute used to set a different sampling priority from the `sampling_percentage` setting. 0 means to never sample the log record, and greater than or equal to 100 means to always sample the log record.

The `probabilistic_sampler` supports two types of sampling for traces:
1. `sampling.priority` [semantic
- convention](https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table) as defined by OpenTracing
- 2. Trace ID hashing
+ convention](https://github.com/opentracing/specification/blob/master/semantic_conventions.md#span-tags-table) as defined by OpenTracing.
+ 2. Trace ID hashing.

The `sampling.priority` semantic convention takes priority over trace ID hashing.
Trace ID hashing samples based on hash values determined by trace IDs.
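
For readers of this diff, the arguments documented above might be combined roughly as in the following sketch. It is an illustration rather than part of the commit: the labels `"traces"` and `"logs"`, the attribute names `logID` and `priority`, the numeric values, and the downstream `otelcol.exporter.otlp.default` component are all assumptions, and the exporter is assumed to be defined elsewhere in the configuration.

```river
// Hypothetical trace sampler: keeps roughly 15.3% of traces by trace ID hashing.
// Every collector in the same tier would need the same hash_seed.
otelcol.processor.probabilistic_sampler "traces" {
  hash_seed           = 123
  sampling_percentage = 15.3

  output {
    // Assumes an exporter labeled "default" exists elsewhere.
    traces = [otelcol.exporter.otlp.default.input]
  }
}

// Hypothetical log sampler: hashes the value of the "logID" record attribute
// instead of the trace ID, and lets a "priority" attribute override the
// percentage (0 = never sample, 100 or more = always sample).
otelcol.processor.probabilistic_sampler "logs" {
  sampling_percentage = 15
  attribute_source    = "record"
  from_attribute      = "logID"
  sampling_priority   = "priority"

  output {
    logs = [otelcol.exporter.otlp.default.input]
  }
}
```

Two separately labeled components are used here because the `attribute_source`, `from_attribute`, and `sampling_priority` arguments apply to logs only, which makes the traces and logs settings easier to read apart.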
