[Monitoring] Create a dashboard for overall clusterfuzz health (#4497) · google/clusterfuzz@6add668


### Motivation

As a final step for the monitoring project, this PR creates a dashboard
for overall system health.

The dashboard is meant to cover:
* Business-level metrics (fuzzing hours, generated testcases, issues
filed, issues closed, testcases for which processing is pending)
* Testcase metrics (untriaged testcase age and count)
* SQS metrics (queue size and published messages, per topic)
* Datastore/GCS metrics (number of requests, error rate, and latencies)
* Utask-level metrics (duration, number of executions, error rate,
latency)

These are sufficient to apply the [RED
methodology](https://grafana.com/blog/2018/08/02/the-red-method-how-to-instrument-your-services/)
(rate, errors, and duration), provide high-level metrics we can alert on,
and aid in troubleshooting outages with a well-defined methodology.

There were two options for committing this to version control: terraform
or butler definitions. Terraform was chosen because it is the preferred
long-term solution and is also simpler to implement, since it supports
copy-pasting the JSON definition from GCP.
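
For illustration only, a minimal sketch of how an exported JSON definition can be embedded in terraform, assuming the Google provider's `google_monitoring_dashboard` resource. The resource name `clusterfuzz_health_example` and the placeholder widget are hypothetical, not the contents of this commit; the real `dashboard_json` would be the dump exported from GCP.

```hcl
# Hypothetical resource name; not taken from this commit.
resource "google_monitoring_dashboard" "clusterfuzz_health_example" {
  # The body below is placeholder JSON standing in for the dashboard
  # definition exported from the GCP console.
  dashboard_json = <<JSON
{
  "displayName": "ClusterFuzz health (example)",
  "gridLayout": {
    "columns": "2",
    "widgets": [
      {
        "title": "Placeholder widget",
        "text": {
          "content": "Replace this object with the exported dashboard JSON.",
          "format": "MARKDOWN"
        }
      }
    ]
  }
}
JSON
}
```

Keeping the JSON in a heredoc means the export can be pasted in as-is, without translating each widget into terraform blocks.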

### Attention points

This should be picked up automatically alongside main.tf, so it should
be sufficient to place the .tf file in the same folder and have
butler deploy handle the terraform apply step.

### How to review

Head over to go/cf-chrome-health-beta, internally. The actual dashboard
definition is not expected to be reviewed, since it is just a dump of
what was manually created in GCP.

Part of #4271

vitorguidi committed Dec 27, 2024

1 parent f4dc5ca commit 6add668
