
Prometheus Remote Write Exporter

Status

  • Stability: beta (metrics)
  • Distributions: core, contrib
  • Code Owners: @Aneurysm9, @rapphil

Prometheus Remote Write Exporter sends OpenTelemetry metrics to Prometheus remote write compatible backends such as Cortex, Mimir, and Thanos. By default, this exporter requires TLS and offers queued retry capabilities.

⚠️ Non-cumulative monotonic, histogram, and summary OTLP metrics are dropped by this exporter.

A design doc is available to document in detail how this exporter works.

Getting Started

The following settings are required:

  • endpoint (no default): the remote write URL to which samples are sent.

By default, TLS is enabled and must be configured under tls:

  • insecure (default = false): whether to enable client transport security for the exporter's connection.

As a result, the following parameters are also required under tls (a configuration sketch follows this list):

  • cert_file (no default): path to the TLS cert to use for TLS required connections. Should only be used if insecure is set to false.
  • key_file (no default): path to the TLS key to use for TLS required connections. Should only be used if insecure is set to false.
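
A minimal TLS configuration sketch (the certificate paths below are placeholders, not defaults):

exporters:
  prometheusremotewrite:
    endpoint: "https://my-cortex:7900/api/v1/push"
    tls:
      insecure: false                  # keep TLS enabled
      cert_file: /path/to/client.crt   # placeholder path
      key_file: /path/to/client.key    # placeholder path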

The following settings can be optionally configured:

  • external_labels: map of label names and values to be attached to each metric data point.
  • headers: additional headers attached to each HTTP request.
    • Note the following headers cannot be changed: Content-Encoding, Content-Type, X-Prometheus-Remote-Write-Version, and User-Agent.
  • namespace: prefix attached to each exported metric name.
  • add_metric_suffixes: If set to false, type and unit suffixes will not be added to metrics. Default: true.
  • send_metadata: If set to true, prometheus metadata will be generated and sent. Default: false.
  • remote_write_queue: fine-tuning for queueing and sending of the outgoing remote writes (see the sketch after this list).
    • enabled: enable the sending queue (default: true)
    • queue_size: number of OTLP metrics that can be queued. Ignored if enabled is false (default: 10000)
    • num_consumers: minimum number of workers to use to fan out the outgoing requests. (default: 5)
  • resource_to_telemetry_conversion
    • enabled (default = false): If enabled is true, all resource attributes will be converted to metric labels.
  • target_info: customize target_info metric
  • export_created_metric:
    • enabled (default = false): If enabled is true, a _created metric is exported for Summary, Histogram, and Monotonic Sum metric points if StartTimeUnixNano is set.
  • max_batch_size_bytes (default = 3000000 -> ~2.861 MiB): Maximum size of a batch of samples to be sent to the remote write endpoint. If the batch size is larger than this value, it will be split into multiple batches.
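
A sketch combining the queue and batching knobs (the values and the namespace prefix are illustrative, not recommendations):

exporters:
  prometheusremotewrite:
    endpoint: "https://my-cortex:7900/api/v1/push"
    namespace: myapp                # hypothetical prefix for exported metric names
    remote_write_queue:
      enabled: true
      queue_size: 10000             # OTLP metrics held in the queue
      num_consumers: 5              # workers fanning out outgoing requests
    max_batch_size_bytes: 3000000   # larger batches are split before sending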

Example:

exporters:
  prometheusremotewrite:
    endpoint: "https://my-cortex:7900/api/v1/push"
    wal: # Enabling the Write-Ahead-Log for the exporter.
      directory: ./prom_rw # The directory to store the WAL in
      buffer_size: 100 # Optional count of elements to be read from the WAL before truncating; default of 300
      truncate_frequency: 45s # Optional frequency for how often the WAL should be truncated. It is a time.ParseDuration; default of 1m
    resource_to_telemetry_conversion:
      enabled: true # Convert resource attributes to metric labels

Example:

exporters:
  prometheusremotewrite:
    endpoint: "https://my-cortex:7900/api/v1/push"
    external_labels:
      label_name1: label_value1
      label_name2: label_value2

Advanced Configuration

Several helper files are leveraged to provide additional capabilities automatically.

Feature gates

This exporter has a feature gate: exporter.prometheusremotewritexporter.RetryOn429. When this feature gate is enabled, the Prometheus remote write exporter will retry on HTTP status code 429 with the provided retry configuration. It currently does not respect the Retry-After HTTP header, if provided, since the retry library used doesn't support this feature.

To enable it, run the collector with the feature gate exporter.prometheusremotewritexporter.RetryOn429 enabled. This can be done by passing one additional parameter: --feature-gates=exporter.prometheusremotewritexporter.RetryOn429.
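
For example, assuming the contrib distribution binary is named otelcol-contrib (adjust for your build):

otelcol-contrib --config=config.yaml --feature-gates=exporter.prometheusremotewritexporter.RetryOn429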

Metric names and labels normalization

OpenTelemetry metric names and attributes are normalized to be compliant with Prometheus naming rules. Details on this normalization process are described in the Prometheus translator module.
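For example, with add_metric_suffixes enabled (the default), a monotonic Sum named request.duration with unit s would typically be exported as request_duration_seconds_total: dots are replaced with underscores, and unit and type suffixes are appended.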

Setting resource attributes as metric labels

By default, resource attributes are added to a special metric called target_info. To select or group metrics by resource attributes, you need to join on target_info. For example, to select metrics with the k8s_namespace_name attribute equal to my-namespace:

app_ads_ad_requests_total * on (job, instance) group_left target_info{k8s_namespace_name="my-namespace"}

Or, to group by a particular attribute (for example, k8s_namespace_name):

sum by (k8s_namespace_name) (app_ads_ad_requests_total * on (job, instance) group_left(k8s_namespace_name) target_info)

This is not a common pattern, and we recommend copying the most common resource attributes into metric labels. You can do this through the transform processor:

processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
        - set(attributes["namespace"], resource.attributes["k8s.namespace.name"])
        - set(attributes["container"], resource.attributes["k8s.container.name"])
        - set(attributes["pod"], resource.attributes["k8s.pod.name"])

After this, grouping or selecting becomes as simple as:

app_ads_ad_requests_total{namespace="my-namespace"}

sum by (namespace) (app_ads_ad_requests_total)