# Datadog monitoring
The Enterprise Command Center (ECC) group is responsible for Datadog issues such as adding or modifying user roles, enterprise monitoring, dashboards, or alerts for an Application Service, and for adding, removing, or administering services, network devices, and other equipment in monitoring tasks.
Note: the DOTS team is no longer responsible for Datadog and will not be resolving Datadog requests.
To request an account, submit a ticket on VA's Enterprise Service Desk ServiceNow Portal at yourit.va.gov (you must be on the VA network). For instructions on how to fill out a ticket for Datadog access, see Datadog: Datadog Access.
Sample business justification:
As a member of the VA Virtual Regional Office (VRO) team, I am requesting access to existing Datadog dashboards on Lighthouse Delivery Infrastructure. An example of a dashboard I need to access: [url to dashboard].
To check the status of your tickets: My Tickets.
This is a revised version of an LHDI announcement in Dec 2023
- primary dashboard: VRO on LHDI
- VRO app: xample-workflows
- VRO data service: BGS
- VRO data service: BIE Kafka
- VRO data service: BIP
Our deprecated Datadog account:
- lasershark's dashboard - one of the Alpha customers
- LHDI
- Va.gov
  - Max CFI a.k.a. Max Ratings API
  - EP Merge
  - Benefits - VRO Virtual Regional Office Endpoints
We've received conflicting feedback regarding use of custom metrics and Datadog's REST API. To clarify: yes, the REST API is available for judicious use, provided you are aware of the cost considerations described in this section.
- Get the Datadog API and APP Key Environment Variables:
- VRO tenants are encouraged to use the shared global Helm template `_datadog.tpl` that has been populated in each LHDI deployment of VRO. To reference this shared global template in your project's Helm chart, add the following in the "env" section of your deployment.yaml:

```
{{- include "vro.datadog.envVars" . | nindent 12 }}
```
- use EP Merge app as an example
- Please note that the environment variables as expected by the Datadog Python SDK are as follows:
- DD_API_KEY (sha fingerprint)
- DD_APP_KEY (a uuid)
- DD_SITE (ddog-gov.com; the corresponding API endpoint is https://api.ddog-gov.com)
- For more relevant documentation and additional API example code, refer to Datadog's API docs.
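As a sketch of how the shared template include fits into a tenant's deployment.yaml (the container name and image below are placeholders, not taken from the VRO repo):

```yaml
# Hypothetical excerpt of a tenant's deployment.yaml. The include line
# pulls the shared Datadog environment variables from _datadog.tpl.
spec:
  template:
    spec:
      containers:
        - name: my-vro-app          # placeholder
          image: my-vro-app:latest  # placeholder
          env:
            {{- include "vro.datadog.envVars" . | nindent 12 }}
```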
Example call:

```shell
## Dynamic Points
# Post time-series data that can be graphed on Datadog's dashboards.
# Curl command
curl -X POST "https://api.ddog-gov.com/api/v2/series" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -d @- << EOF
{
  "series": [
    {
      "metric": "system.load.1",
      "type": 0,
      "points": [
        {
          "timestamp": 1703868203,
          "value": 0.6
        }
      ],
      "resources": [
        {
          "name": "dummyhost",
          "type": "host"
        }
      ]
    }
  ]
}
EOF
```
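The same v2 series call can be sketched in Python using only the standard library; `build_series_payload` and `submit_series` below are hypothetical helper names for illustration, not part of any VRO codebase:

```python
import json
import os
import time
import urllib.request


def build_series_payload(metric: str, value: float, host: str) -> dict:
    """Build the same v2 series payload as the curl example above."""
    return {
        "series": [
            {
                "metric": metric,
                "type": 0,  # 0 = unspecified; see Datadog docs for count/rate/gauge
                "points": [{"timestamp": int(time.time()), "value": value}],
                "resources": [{"name": host, "type": "host"}],
            }
        ]
    }


def submit_series(payload: dict) -> int:
    """POST the payload to the GovCloud endpoint; DD_API_KEY comes from
    the environment variables described above. Returns the HTTP status."""
    req = urllib.request.Request(
        "https://api.ddog-gov.com/api/v2/series",
        data=json.dumps(payload).encode(),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "DD-API-KEY": os.environ["DD_API_KEY"],
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```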
If used incorrectly, custom metrics can become prohibitively expensive in Datadog.
The main issue arises when custom metrics are combined with highly variable tags (such as an ICN), which can greatly increase the cost. This is because we are charged for all the metric and tag combinations used during a billing period. For example, if we had a single failure metric tagged with ICN, and there were failures in a month for 1000 different users, we would be charged for 1000 metric/tag combinations. So, in general, we just need to be mindful not to add unnecessary tags to any metrics we create.
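The billing arithmetic above can be illustrated with a small sketch; `billed_combinations` is a hypothetical helper that counts distinct metric/tag combinations the way Datadog bills them:

```python
def billed_combinations(events):
    """Count distinct (metric, tags) combinations from a list of
    (metric_name, frozenset_of_tags) events - each distinct combination
    seen in a billing period is billed as a separate custom metric."""
    return len({(metric, tags) for metric, tags in events})


# 1000 failures tagged with a per-user ICN -> 1000 billable combinations.
per_user = [("vro.failures", frozenset({f"icn:{i}"})) for i in range(1000)]

# The same 1000 failures with only low-cardinality tags -> 1 combination.
coarse = [("vro.failures", frozenset({"env:prod", "service:ep-merge"}))] * 1000

print(billed_combinations(per_user))  # 1000
print(billed_combinations(coarse))    # 1
```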
LHDI now supports RDS metrics for Postgres. Once enabled, you can see Postgres metrics in Datadog using the Metrics Explorer.