Merge branch 'main' into get-rid-of-shared-alerts
Signed-off-by: QuentinBisson <[email protected]>
QuentinBisson committed Jun 11, 2024
2 parents 380d5a0 + ff29140 commit 61b77ec
Showing 4 changed files with 12 additions and 7 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -29,10 +29,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Fixed usage of yq, and jq in check-opsrecipes.sh
- Fetch jq with make install-tools
- - Fixed and improve the check-opsrecipes.sh script so support <directory>/_index.md based ops-recipes.
+ - Fixed and improve the check-opsrecipes.sh script to support <directory>/_index.md based ops-recipes.
- Fixed cabbage alerts for multi-provider MCs.
- Fixed all area alert labels.
- Fixed `cert-exporter` alerts to page on all providers.
+ - Fix `ManagementClusterDexAppMissing` use of absent for mimir.

### Removed

@@ -41,7 +41,11 @@ spec:
annotations:
description: '{{`dex-operator did not register a dex-app in giantswarm namespace.`}}'
opsrecipe: dex-operator/
{{- if .Values.mimir.enabled }}
expr: absent(dex_operator_idp_secret_expiry_time{app_namespace="giantswarm", cluster_type="management_cluster", cluster_id="{{ .Values.managementCluster.name }}", installation="{{ .Values.managementCluster.name }}", provider="{{ .Values.managementCluster.provider.kind }}", pipeline="{{ .Values.managementCluster.pipeline }}"})
{{- else }}
expr: absent(dex_operator_idp_secret_expiry_time{app_namespace="giantswarm", cluster_type="management_cluster"}) == 1
{{- end }}
for: 30m
labels:
area: kaas
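
A note on the two `expr` branches above: PromQL's `absent()` returns a single sample with value 1 whose labels are taken from the equality matchers of its selector, so the longer selector produces an alert that carries `cluster_id`, `installation`, `provider`, and `pipeline`. The Python sketch below is a toy model of that behaviour, not Prometheus code; the function name and the label values (`demo`, `capa`, `testing`) are hypothetical stand-ins for the Helm values.

```python
# Toy model of PromQL absent(); an illustration only, not the Prometheus implementation.
def absent(series, matchers):
    """Return [] if any series satisfies every equality matcher,
    otherwise one sample of value 1 labelled with the matchers."""
    for labels in series:
        if all(labels.get(key) == value for key, value in matchers.items()):
            return []
    return [(dict(matchers), 1)]


# Selector used in the non-mimir branch.
short = {"app_namespace": "giantswarm", "cluster_type": "management_cluster"}

# Selector used in the mimir branch; the values are hypothetical placeholders.
full = {
    **short,
    "cluster_id": "demo",
    "installation": "demo",
    "provider": "capa",
    "pipeline": "testing",
}

no_series = []  # the metric is missing entirely
short_result = absent(no_series, short)
full_result = absent(no_series, full)
```

With no matching series, `short_result` carries only the two labels of the short selector, while `full_result` also carries the cluster, installation, provider, and pipeline labels.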
@@ -17,7 +17,7 @@ spec:
annotations:
description: '{{`Management cluster {{ $labels.cluster_id }} has less than 3 nodes.`}}'
opsrecipe: management-cluster-less-than-three-workers/
- expr: sum(kubelet_node_name{cluster_type="management_cluster"} * on (cluster_id, node) kube_node_role{role="worker", cluster_type="management_cluster"}) by (cluster_id) < 3
+ expr: sum(kubelet_node_name{cluster_type="management_cluster"} * on (cluster_id, node) kube_node_role{role="worker", cluster_type="management_cluster"}) by (cluster_id, installation, pipeline, provider) < 3
for: 1h
labels:
area: kaas
@@ -28,7 +28,7 @@ spec:
- alert: ManagementClusterMissingNodes
annotations:
description: '{{`Management cluster {{ $labels.cluster_id }} has less than 4 minimum nodes.`}}'
- expr: sum(kube_node_status_condition{cluster_type="management_cluster", condition="Ready", status="true"}) by (cluster_id) < 4
+ expr: sum(kube_node_status_condition{cluster_type="management_cluster", condition="Ready", status="true"}) by (cluster_id, installation, pipeline, provider) < 4
for: 15m
labels:
area: kaas
@@ -62,7 +62,7 @@ spec:
- alert: ManagementClusterPodLimitAlmostReached
annotations:
description: '{{`Cluster {{ $labels.cluster_id }} is almost exceeding its pod limit.`}}'
- expr: (sum(kube_pod_info{cluster_type="management_cluster"}) by (cluster_id) / sum(kube_node_status_capacity{resource="pods", cluster_type="management_cluster"}) by (cluster_id)) > 0.8
+ expr: (sum(kube_pod_info{cluster_type="management_cluster"}) by (cluster_id, installation, pipeline, provider) / sum(kube_node_status_capacity{resource="pods", cluster_type="management_cluster"}) by (cluster_id, installation, pipeline, provider)) > 0.8
for: 5m
labels:
area: kaas
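
A note on the `by (...)` changes in this file: a PromQL aggregation keeps only the labels listed in its `by` clause, so grouping on `cluster_id` alone drops `installation`, `pipeline`, and `provider` from the resulting alert. The Python sketch below is a toy model of `sum(...) by (...)` under that assumption; the helper name `sum_by` and the sample label values are hypothetical.

```python
from collections import defaultdict


def sum_by(samples, keys):
    """Toy model of PromQL `sum(...) by (keys)`: group samples on the
    listed labels, add up their values, and drop every other label."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple((key, labels.get(key, "")) for key in keys)
        totals[group] += value
    return dict(totals)


# Three ready nodes in one management cluster (hypothetical label values).
base = {
    "cluster_type": "management_cluster",
    "cluster_id": "demo",
    "installation": "demo",
    "pipeline": "testing",
    "provider": "capa",
}
samples = [({**base, "node": f"node-{i}"}, 1.0) for i in range(3)]

# Grouping before this change: only cluster_id survives on the result.
old = sum_by(samples, ["cluster_id"])

# Grouping after this change: the three extra labels survive as well.
new = sum_by(samples, ["cluster_id", "installation", "pipeline", "provider"])
```

Both groupings sum to the same value here; only the label set attached to the result differs, and that label set is what ends up on the fired alert.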
6 changes: 3 additions & 3 deletions test/conf/promtool_ignore
@@ -1,7 +1,4 @@
kaas/bigmac/alerting-rules/cert-manager.rules.yml
- kaas/bigmac/alerting-rules/certificate.all.rules.yml
- kaas/bigmac/alerting-rules/certificate.management-cluster.rules.yml
- kaas/bigmac/alerting-rules/certificate.workload-cluster.rules.yml
kaas/bigmac/alerting-rules/dex.rules.yml
kaas/bigmac/alerting-rules/teleport.rules.yml
kaas/phoenix/alerting-rules/aws-load-balancer-controller.rules.yml
@@ -27,6 +24,9 @@ kaas/turtles/alerting-rules/capi-machinedeployment.rules.yml
kaas/turtles/alerting-rules/capi-machinepool.rules.yml
kaas/turtles/alerting-rules/capi-machineset.rules.yml
kaas/turtles/alerting-rules/capi.management-cluster.rules.yml
+ kaas/turtles/alerting-rules/certificate.all.rules.yml
+ kaas/turtles/alerting-rules/certificate.management-cluster.rules.yml
+ kaas/turtles/alerting-rules/certificate.workload-cluster.rules.yml
kaas/turtles/alerting-rules/cluster-autoscaler.rules.yml
kaas/turtles/alerting-rules/docker.rules.yml
kaas/turtles/alerting-rules/etcd.management-cluster.rules.yml
