Fix prometheus agent alerts (#913)
* Fix Prometheus agent failing alerts

Signed-off-by: QuentinBisson <[email protected]>

* Add outside_working_hours_inhibition back to agent shard alert

Signed-off-by: QuentinBisson <[email protected]>

---------

Signed-off-by: QuentinBisson <[email protected]>
QuentinBisson authored Sep 19, 2023
1 parent 737a5e9 commit 98f70c8
Showing 3 changed files with 5 additions and 4 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -10,7 +10,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed

 - Add missing prometheus-agent inhibition to `KubeStateMetricsDown` alert
-- Change time duration before `ManagementClusterDeploymentMissingAWS` pages because it is dependant on the `PrometheusAgentFailing` alert.
+- Change time duration before `ManagementClusterDeploymentMissingAWS` pages because it is dependant on the `PrometheusAgentFailing` alert.
+
+### Fixed
+
+- Remove `cancel_if_outside_working_hours` from PrometheusAgent alerts.

 ## [2.132.0] - 2023-09-15

@@ -38,7 +38,6 @@ spec:
 cancel_if_cluster_is_not_running_prometheus_agent: "true"
 cancel_if_cluster_status_creating: "true"
 cancel_if_cluster_status_deleting: "true"
-cancel_if_outside_working_hours: "true"
 ## Page Atlas if prometheus agent is missing shards to send samples to MC prometheus.
 - alert: PrometheusAgentShardsMissing
 annotations:
2 changes: 0 additions & 2 deletions test/tests/providers/global/prometheus-agent.rules.test.yml
@@ -22,7 +22,6 @@ tests:
 cancel_if_cluster_is_not_running_prometheus_agent: "true"
 cancel_if_cluster_status_creating: "true"
 cancel_if_cluster_status_deleting: "true"
-cancel_if_outside_working_hours: "true"
 exp_annotations:
 dashboard: "promRW001/prometheus-remote-write"
 description: "Prometheus agent remote write is failing."
@@ -44,7 +43,6 @@ tests:
 cancel_if_cluster_is_not_running_prometheus_agent: "true"
 cancel_if_cluster_status_creating: "true"
 cancel_if_cluster_status_deleting: "true"
-cancel_if_outside_working_hours: "true"
 exp_annotations:
 dashboard: "promRW001/prometheus-remote-write"
 description: "Prometheus agent remote write is failing."
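
For orientation, the expected-alert block in the test file would look roughly like the sketch below once this change is applied. This is an illustration only: the indentation was lost in the page capture, the nesting under `exp_alerts` / `exp_labels` is assumed from the usual promtool unit-test layout, and any labels outside the diff context are not shown.

```yaml
# Sketch only: indentation and nesting are assumed, not copied from the repo.
exp_alerts:
  - exp_labels:
      # Labels outside the diff context are omitted here.
      cancel_if_cluster_is_not_running_prometheus_agent: "true"
      cancel_if_cluster_status_creating: "true"
      cancel_if_cluster_status_deleting: "true"
      # cancel_if_outside_working_hours is no longer expected after this commit.
    exp_annotations:
      dashboard: "promRW001/prometheus-remote-write"
      description: "Prometheus agent remote write is failing."
```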