Use workingHoursOnly on more alerts (#1009)
* Use workingHoursOnly on more alerts

Signed-off-by: Marcus Noble <[email protected]>

* Removed old openstack unit tests

Signed-off-by: Marcus Noble <[email protected]>

* No longer silence all CAPA and CAPZ alerts out of hours by default

Signed-off-by: Marcus Noble <[email protected]>

* Updated unit tests

Signed-off-by: Marcus Noble <[email protected]>

---------

Signed-off-by: Marcus Noble <[email protected]>
AverageMarcus authored Jan 22, 2024
1 parent ee2ad40 commit c4b0db3
Showing 22 changed files with 29 additions and 408 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -14,6 +14,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Changed

- Changed teleport alerts to take into account only `Provisioned` clusters
- Made use of `workingHoursOnly` template on more alerts to ensure `stable-testing` MCs don't page out of hours
- No longer silence all CAPA and CAPZ alerts out of hours by default

## [2.148.0] - 2024-01-17

@@ -150,7 +152,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

- fixed `aggregation:kyverno_policy_job_status_team` expression.
- fixed `aggregation:kyverno_policy_job_status_team` expression.

### Added

2 changes: 1 addition & 1 deletion helm/prometheus-rules/templates/_helpers.tpl
@@ -40,7 +40,7 @@ phoenix
{{- end -}}

{{- define "workingHoursOnly" -}}
{{- if has .Values.managementCluster.provider.kind (list "openstack" "capz" "capa") -}}
{{- if has .Values.managementCluster.provider.kind (list "openstack") -}}
"true"
{{- else if eq .Values.managementCluster.pipeline "stable-testing" -}}
"true"
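
As a rough illustration of what the helper evaluates to after this change, the visible branches can be mirrored in a few lines of Python. This is only a sketch: the function name `working_hours_only` and the pipeline value `"stable"` are hypothetical, and the final fallback of `"false"` is an assumption, since the rest of the template is cut off in this diff (the updated CAPZ test expectations are consistent with it).

```python
def working_hours_only(provider_kind: str, pipeline: str) -> str:
    """Mirror the visible branches of the `workingHoursOnly` helper after this commit."""
    # After this commit only "openstack" remains in the provider list;
    # "capz" and "capa" were removed from it.
    if provider_kind in ("openstack",):
        return "true"
    # Management clusters on the stable-testing pipeline never page out of hours.
    if pipeline == "stable-testing":
        return "true"
    # Assumption: the fallback branch is not shown in the diff.
    return "false"


# "stable" is a hypothetical pipeline name used only for illustration.
for kind, pipeline in [
    ("openstack", "stable"),
    ("capz", "stable"),
    ("capa", "stable-testing"),
]:
    print(kind, pipeline, working_hours_only(kind, pipeline))
```

Under these assumptions a CAPZ or CAPA management cluster only gets `cancel_if_outside_working_hours: "true"` when it is on the `stable-testing` pipeline.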
@@ -23,7 +23,7 @@ spec:
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: phoenix
topic: alb
@@ -38,7 +38,7 @@ spec:
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: phoenix
topic: alb
4 changes: 2 additions & 2 deletions helm/prometheus-rules/templates/alerting-rules/kong.rules.yml
@@ -22,7 +22,7 @@ spec:
area: managedservices
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: cabbage
topic: kong
@@ -36,7 +36,7 @@ spec:
area: managedservices
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: cabbage
topic: kong
@@ -34,7 +34,7 @@ spec:
inhibit_kube_state_metrics_down: "true"
cancel_if_prometheus_agent_down: "true"
cancel_if_kubelet_down: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: atlas
topic: observability
@@ -21,7 +21,7 @@ spec:
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: shield
topic: kyverno
@@ -38,7 +38,7 @@ spec:
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: shield
topic: kyverno
@@ -53,7 +53,7 @@ spec:
cancel_if_cluster_status_creating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: notify
team: shield
topic: kyverno
@@ -18,7 +18,7 @@ spec:
for: 30m
labels:
area: managedservices
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: cabbage
topic: linkerd
@@ -23,7 +23,7 @@ spec:
cancel_if_cluster_status_updating: "true"
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_has_no_workers: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: atlas
topic: observability
@@ -21,7 +21,7 @@ spec:
cancel_if_cluster_status_deleting: "true"
cancel_if_cluster_status_updating: "true"
cancel_if_scrape_timeout: "true"
cancel_if_outside_working_hours: "false"
cancel_if_outside_working_hours: {{ include "workingHoursOnly" . }}
severity: page
team: atlas
topic: observability
1 change: 0 additions & 1 deletion test/conf/providers
@@ -1,3 +1,2 @@
vintage/aws
capi/openstack
capi/capz
6 changes: 3 additions & 3 deletions test/tests/providers/capi/capz/capi-cluster.rules.test.yml
@@ -20,7 +20,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: page
team: phoenix
topic: managementcluster
@@ -35,7 +35,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -51,7 +51,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -20,7 +20,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -35,7 +35,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
4 changes: 2 additions & 2 deletions test/tests/providers/capi/capz/capi-machine.rules.test.yml
@@ -16,7 +16,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -32,7 +32,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -20,7 +20,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -35,7 +35,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -20,7 +20,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -35,7 +35,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -12,7 +12,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -20,7 +20,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
@@ -36,7 +36,7 @@ tests:
exp_alerts:
- exp_labels:
area: kaas
cancel_if_outside_working_hours: "true"
cancel_if_outside_working_hours: "false"
severity: notify
team: phoenix
topic: managementcluster
97 changes: 0 additions & 97 deletions test/tests/providers/capi/openstack/capi.rules.test.yml

This file was deleted.
