This document assumes that you have already read and understood the general README. If not, start reading there.
This document describes how absence alert rules are defined.
The absence alert rules are defined in a separate PrometheusRule resource that is managed by the operator. They are aggregated first by namespace and then by the Prometheus server.
For example, if a namespace has alert rules defined across several PrometheusRule resources for the Prometheus servers called OpenStack and Infra, the absence alert rules for this namespace would be aggregated in two new PrometheusRule resources called:
openstack-absent-metric-alert-rules
infra-absent-metric-alert-rules
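The naming pattern in the example above can be sketched as follows. This is only an illustration inferred from the two names shown: the helper `absence_rule_resource_name` is hypothetical, and lowercasing the Prometheus server name is an assumption based on the example, not a documented rule of the operator.

```python
def absence_rule_resource_name(prometheus_server: str) -> str:
    """Hypothetical helper: name of the aggregated PrometheusRule resource.

    Assumption: the server name is lowercased and a fixed suffix is appended,
    as suggested by the OpenStack and Infra examples.
    """
    return f"{prometheus_server.lower()}-absent-metric-alert-rules"


# One aggregated resource per Prometheus server within a namespace.
for server in ("OpenStack", "Infra"):
    print(absence_rule_resource_name(server))
```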
The absence alert rule has the following template:
alert: $name
expr: absent($metric)
for: 10m
labels:
  context: absent-metrics
  severity: info
  support_group: $support_group
  service: $service
annotations:
  summary: missing $metric
  description: The metric '$metric' is missing. '$alert-name' alert using it may not fire as intended.
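To make the placeholders concrete, the sketch below fills in the template for a single metric. The function `render_absence_rule` and the original alert name `LimesScrapesFailing` are hypothetical and exist only for illustration. The sketch assumes that both the support_group and service labels are present, and it does not add the playbook link to the description.

```python
def render_absence_rule(name, metric, alert_name, support_group, service):
    """Hypothetical helper: substitute values into the absence rule template."""
    return {
        "alert": name,
        "expr": f"absent({metric})",
        "for": "10m",
        "labels": {
            "context": "absent-metrics",
            "severity": "info",
            "support_group": support_group,
            "service": service,
        },
        "annotations": {
            "summary": f"missing {metric}",
            "description": (
                f"The metric '{metric}' is missing. '{alert_name}' alert "
                "using it may not fire as intended."
            ),
        },
    }


rule = render_absence_rule(
    name="AbsentContainersLimesSuccessfulScrapesRate5m",
    metric="limes_successful_scrapes:rate5m",
    alert_name="LimesScrapesFailing",  # hypothetical original alert name
    support_group="containers",
    service="limes",
)
print(rule["expr"])
```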
Consider an alert rule that uses a metric called limes_successful_scrapes:rate5m and has the support group label set to containers and the service label set to limes. The name of the corresponding absence alert rule would be AbsentContainersLimesSuccessfulScrapesRate5m.
The values of the support_group and service labels are only included in the name if the labels are specified in the --keep-labels flag.
The description also includes a link to the playbook for operators, which explains how to deal with absence alert rules.
Labels that are specified with the --keep-labels flag are retained from the original alert rule and defined on the corresponding absence alert rule as is.
The support_group and service labels are a special case: they have custom behavior that is defined in the playbook for operators.
The following labels are always present on all absence alert rules:
severity: info
context: absent-metrics
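The label rules described above can be summarized in a short sketch. The function `absence_rule_labels` and the `tier` label are hypothetical. The sketch copies kept labels as is and then sets the two fixed labels, on the assumption that the fixed values take precedence. It deliberately does not model the custom behavior of support_group and service, which is described in the playbook for operators.

```python
# Labels that every absence alert rule carries, regardless of the original rule.
ALWAYS_PRESENT = {"severity": "info", "context": "absent-metrics"}


def absence_rule_labels(original_labels: dict, keep_labels: set) -> dict:
    """Hypothetical helper: compute the labels of an absence alert rule."""
    # Retain only the labels named in --keep-labels, unchanged.
    labels = {k: v for k, v in original_labels.items() if k in keep_labels}
    # Assumption: the fixed labels override any value from the original rule.
    labels.update(ALWAYS_PRESENT)
    return labels


original = {"severity": "critical", "tier": "os", "team": "storage"}
print(absence_rule_labels(original, {"tier"}))
```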