Alerts need:
Response time
Errors
CPU runtime / storage / memory
Apdex (an amalgamated metric approximating user satisfaction)
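For reference, Apdex is usually computed from response-time samples against a target threshold T: samples at or under T count as satisfied, samples between T and 4T count as tolerating (half weight), and the rest count as frustrated. A minimal sketch, assuming a 0.5 second threshold (an illustrative value, not one agreed in this issue) and a hypothetical function name:

```python
def apdex(response_times_s, threshold_s=0.5):
    """Return the Apdex score (0.0 to 1.0) for response times in seconds.

    threshold_s is the "satisfied" target T; 0.5 is only an assumed default.
    """
    if not response_times_s:
        return None  # no samples, so the score is undefined

    # Satisfied: at or under T. Tolerating: over T but at or under 4T.
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )

    # Frustrated samples (over 4T) contribute nothing to the numerator.
    return (satisfied + tolerating / 2) / len(response_times_s)
```

For example, `apdex([0.1, 0.2, 0.6, 3.0])` has two satisfied, one tolerating, and one frustrated sample, giving 0.625.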
Alerts should:
Be actionable
Include a permalink to the metric, or to a dashboard displaying all relevant metrics, along with a runbook and/or a troubleshooting guide
Be checked mostly after codebase changes, but also flag issues we can't troubleshoot ourselves so that they can be ticketed to Acquia
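One way to enforce the "actionable" requirement is to refuse to build an alert message unless it carries the links a responder needs. This is only a sketch: `build_alert` and its parameter names are hypothetical, and the URLs in the usage note are placeholders.

```python
def build_alert(summary, permalink, runbook_url=None, guide_url=None):
    """Assemble an alert message that carries the links a responder needs.

    permalink: link to the metric or to a dashboard of relevant metrics.
    runbook_url / guide_url: at least one of these must be provided.
    """
    if not permalink:
        raise ValueError("alert needs a permalink to the metric or dashboard")
    if not (runbook_url or guide_url):
        raise ValueError("alert needs a runbook and/or troubleshooting guide")

    lines = [summary, f"Metric/dashboard: {permalink}"]
    if runbook_url:
        lines.append(f"Runbook: {runbook_url}")
    if guide_url:
        lines.append(f"Troubleshooting guide: {guide_url}")
    return "\n".join(lines)
```

Usage, with placeholder URLs: `build_alert("Response time 10x normal", "https://example.com/dashboard", runbook_url="https://example.com/runbook")`.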
Alerts can:
Monitor modules/hooks
Monitor SQL DB
Monitor other things that aren't directly relevant (?) but may be useful for troubleshooting
A reasonable rule here: we should have an alert if, over a 5-15 minute period, things take 10x the time they normally would, drop to 10% of the quality they normally have, or exceed some error threshold (still undefined).
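The rule above could be sketched roughly as follows. Several things here are assumptions rather than decisions from this issue: "quality" is read as the Apdex score, the error threshold is left as a required parameter because it is still undefined, and `WindowStats` and `should_alert` are hypothetical names. Each `WindowStats` is meant to summarise one 5-15 minute window.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Summary of one evaluation window (assumed to be 5-15 minutes)."""
    avg_response_s: float  # average response time in seconds
    apdex: float           # stand-in for "quality" (an assumption)
    error_count: int       # errors seen during the window


def should_alert(current, baseline, error_threshold,
                 slow_factor=10.0, quality_floor=0.10):
    """Return the list of reasons to alert; an empty list means no alert."""
    reasons = []
    # 10x the time things would normally take
    if current.avg_response_s >= slow_factor * baseline.avg_response_s:
        reasons.append("response time")
    # 10% of the quality things would normally have
    if current.apdex <= quality_floor * baseline.apdex:
        reasons.append("quality")
    # error threshold is not yet defined, so the caller must supply it
    if current.error_count > error_threshold:
        reasons.append("errors")
    return reasons
```

Returning the reasons rather than a bare boolean makes it easier to put the triggering condition into the alert text.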