Is your feature request related to a problem? Please describe.
The Kube Logging Operator already includes many well-documented PrometheusRules, each with a readable alert name, a short summary, and a description. In daily work it's easy to spot an error and start working on the root cause.
For less experienced staff, such as the on-call team or 1st-level support, this can be overwhelming: without deeper knowledge of the architecture, it is unclear what an alert relates to or what to do to investigate or solve the issue.
Describe the solution you'd like
The typical use case for the operations team is the runbook feature of PrometheusRule annotations. The best reference is the runbook of the Prometheus project itself.
Basically, the PrometheusRule contains a link to a web service with additional instructions for the related alert. This is easy to manage, and everybody can contribute to the documentation and improve the working steps.
I started a proposal here
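To illustrate the idea, here is a minimal sketch of a PrometheusRule that carries a runbook link. It assumes the `runbook_url` annotation key, which is the convention used by the Prometheus community runbooks; the rule name, alert name, expression, and URL are hypothetical placeholders and are not taken from the Logging Operator.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: logging-example-rules            # hypothetical resource name
spec:
  groups:
    - name: logging-example
      rules:
        - alert: ExampleLoggingAlert     # hypothetical alert name
          # Placeholder expression; a real rule would use the operator's own metrics.
          expr: 'up{job="example-logging"} == 0'
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Example logging component is down.
            description: The example logging target has not been scrapeable for 10 minutes.
            # Link to the externally hosted runbook page for this alert (placeholder URL).
            runbook_url: https://runbooks.example.com/logging/ExampleLoggingAlert
```

Because the annotation only holds a URL, the investigation steps can be edited on the runbook site without touching the rule in the cluster.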
Describe alternatives you've considered
Alternatively, you can put all this information in the PrometheusRule itself. But that is more static and needs more cluster resources.
Additional context
https://en.wikipedia.org/wiki/Runbook