Skip to content

Latest commit

 

History

History
63 lines (45 loc) · 2.15 KB

absence-alert-rule-definition.md

File metadata and controls

63 lines (45 loc) · 2.15 KB

Absence alert rule definition

This document assumes that you have already read and understood the general README. If not, start reading there.

This document describes how absence alert rules are defined.

Aggregation

The absence alert rules are defined in a separate PrometheusRule resource that is managed by the operator. They are aggregated first by namespace and then by the Prometheus server.

For example, if a namespace has alert rules defined across several PrometheusRule resources for the Prometheus servers called OpenStack and Infra. The absent alert rules for this namespace would be aggregated in two new PrometheusRule resources called:

  • openstack-absent-metric-alert-rules
  • infra-absent-metric-alert-rules

Rule Template

The absence alert rule has the following template:

alert: $name
expr: absent($metric)
for: 10m
labels:
  context: absent-metrics
  severity: info
  support_group: $support_group
  service: $service
annotations:
  summary: missing $metric
  description: The metric '$metric' is missing. '$alert-name' alert using it may not fire as intended.

Consider the an alert rule that uses a metric called limes_successful_scrapes:rate5m with support group containers and service limes labels. The name of the corresponding absence alert rule would be AbsentContainersLimesSuccessfulScrapesRate5m.

The values of support_group and service labels are only included in the name if the labels are specified in the --keep-labels flag.

The description also includes a link to the playbook for operators that can be referenced on how to deal with absence alert rules.

Labels

Labels which are specified with the --keep-labels flag will be retained from the original alert rule and will be defined on the corresponding absence alert rule as is.

The support_group and service labels are a special case, they have some custom behavior which is defined in the playbook for operators.

Defaults

The following labels are always present on all absence alert rules:

  • severity: info
  • context: absent-metrics