Commit

fix prometheusrequestserror alert
Signed-off-by: QuentinBisson <[email protected]>
QuentinBisson committed Aug 27, 2024
1 parent 618ee88 commit 55c7942
Showing 2 changed files with 5 additions and 2 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

+### Fixed
+
+- Fix PromtailRequestError to also account for 4xx and -1 errors (https://github.com/giantswarm/giantswarm/issues/31387).
+
## [4.12.0] - 2024-08-26

### Added
3 changes: 1 addition & 2 deletions (file name not captured)
@@ -31,13 +31,12 @@ spec:
cancel_if_cluster_status_updating: "true"
cancel_if_node_unschedulable: "true"
cancel_if_node_not_ready: "true"
-# Not tested
- alert: PromtailRequestsErrors
annotations:
description: This alert checks if that the amount of failed requests is below 10% for promtail
opsrecipe: promtail/
expr: |
-100 * sum(rate(promtail_request_duration_seconds_count{status_code=~"5..|failed"}[2m])) by (cluster_id, installation, provider, pipeline, namespace, job, route, instance) / sum(rate(promtail_request_duration_seconds_count[2m])) by (cluster_id, installation, provider, pipeline, namespace, job, route, instance) > 10
+100 * (sum(rate(promtail_request_duration_seconds_count{status_code!~"2.."}[2m])) by (cluster_id, installation, provider, pipeline, namespace, job, route, instance) / sum(rate(promtail_request_duration_seconds_count[2m])) by (cluster_id, installation, provider, pipeline, namespace, job, route, instance)) > 10
for: 15m
labels:
area: platform
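
Note on the matcher change: the sketch below illustrates which `status_code` label values each version of the expression would count as errors. It relies on the fact that Prometheus regex matchers are fully anchored, so Python's `re.fullmatch` is used as a stand-in for them. The list of sample status values is an assumption made for illustration and is not taken from this commit or from Promtail's documentation.

```python
import re

# Hypothetical sample of status_code label values (assumed for illustration).
SAMPLE_STATUS_CODES = ["200", "204", "400", "429", "500", "503", "-1", "failed"]


def old_matcher(status_code: str) -> bool:
    """Mimics status_code=~"5..|failed" (anchored regex match)."""
    return re.fullmatch(r"5..|failed", status_code) is not None


def new_matcher(status_code: str) -> bool:
    """Mimics status_code!~"2.." (anything that is not a 2xx value)."""
    return re.fullmatch(r"2..", status_code) is None


# Values counted as errors by each version of the alert expression.
old_errors = [s for s in SAMPLE_STATUS_CODES if old_matcher(s)]
new_errors = [s for s in SAMPLE_STATUS_CODES if new_matcher(s)]

# Values that only the new expression treats as errors.
newly_counted = [s for s in new_errors if s not in old_errors]

print("old:", old_errors)
print("new:", new_errors)
print("newly counted:", newly_counted)
```

With this sample, the negated matcher picks up the `4xx` and `-1` values that the changelog entry mentions. Because it is a negation, it would also count any other non-2xx value (for example a `3xx` code) if such a value were ever reported.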