[mongo] Support auto-discover available databases for the monitored mongodb instance (DataDog#17959)

* autodiscover mongodb databases

* Add database autodiscovery support

* remove print

* only list authorized collections not views

* ignore collections from config when database autodiscovery is enabled

* add changelog

* update changelog

* fix license header

* update include list with deprecated dbnames

* fix test

* update comments

* update changelog

* return databases and count

* update readme

* update config description to dbnames
lu-zhengda authored and ravindrasojitra-crest committed Aug 5, 2024
1 parent ff22e31 commit d8d3ad4
Showing 28 changed files with 1,331 additions and 80 deletions.
89 changes: 87 additions & 2 deletions mongo/README.md
@@ -44,7 +44,12 @@ db.createUser({
   "roles": [
     { role: "read", db: "admin" },
     { role: "clusterMonitor", db: "admin" },
-    { role: "read", db: "local" }
+    { role: "read", db: "local" },
+    # Grant additional read-only access to the database you want to collect collection/index statistics from.
+    { role: "read", db: "mydb" },
+    { role: "read", db: "myanotherdb" },
+    # Alternatively, grant read-only access to all databases.
+    { role: "readAnyDatabase", db: "admin" }
   ]
 })
```
@@ -72,7 +77,12 @@ db.createUser({
   "roles": [
     { role: "read", db: "admin" },
     { role: "clusterMonitor", db: "admin" },
-    { role: "read", db: "local" }
+    { role: "read", db: "local" },
+    # Grant additional read-only access to the database you want to collect collection/index statistics from.
+    { role: "read", db: "mydb" },
+    { role: "read", db: "myanotherdb" },
+    # Alternatively, grant read-only access to all databases.
+    { role: "readAnyDatabase", db: "admin" }
   ]
 })
```
@@ -200,6 +210,81 @@ To configure this check for an Agent running on a host:
 2. [Restart the Agent][6].
+##### Database Autodiscovery
+Starting from Datadog Agent v7.56, you can enable database autodiscovery to automatically collect metrics from all your databases on the MongoDB instance.
+Please note that database autodiscovery is disabled by default. Read access to the autodiscovered databases is required to collect metrics from them.
+To enable it, add the following configuration to your `mongo.d/conf.yaml` file:
+```yaml
+init_config:
+instances:
+    ## @param hosts - list of strings - required
+    ## Hosts to collect metrics from, as is appropriate for your deployment topology.
+    ## E.g. for a standalone deployment, specify the hostname and port of the mongod instance.
+    ## For replica sets or sharded clusters, see instructions in the sample conf.yaml.
+    ## Only specify multiple hosts when connecting through mongos
+    #
+  - hosts:
+      - <HOST>:<PORT>
+    ## @param username - string - optional
+    ## The username to use for authentication.
+    #
+    username: datadog
+    ## @param password - string - optional
+    ## The password to use for authentication.
+    #
+    password: <UNIQUE_PASSWORD>
+    ## @param options - mapping - optional
+    ## Connection options. For a complete list, see:
+    ## https://docs.mongodb.com/manual/reference/connection-string/#connections-connection-options
+    #
+    options:
+      authSource: admin
+    ## @param database_autodiscovery - mapping - optional
+    ## Enable database autodiscovery to automatically collect metrics from all your MongoDB databases.
+    #
+    database_autodiscovery:
+      ## @param enabled - boolean - required
+      ## Enable database autodiscovery.
+      #
+      enabled: true
+      ## @param include - list of strings - optional
+      ## List of databases to include in the autodiscovery. Use regular expressions to match multiple databases.
+      ## For example, to include all databases starting with "mydb", use "^mydb.*".
+      ## By default, include is set to ".*" and all databases are included.
+      #
+      include:
+        - "^mydb.*"
+      ## @param exclude - list of strings - optional
+      ## List of databases to exclude from the autodiscovery. Use regular expressions to match multiple databases.
+      ## For example, to exclude all databases starting with "mydb", use "^mydb.*".
+      ## When the exclude list conflicts with include list, the exclude list takes precedence.
+      #
+      exclude:
+        - "^mydb2.*"
+        - "admin$"
+      ## @param max_databases - integer - optional
+      ## Maximum number of databases to collect metrics from. The default value is 100.
+      #
+      max_databases: 100
+      ## @param refresh_interval - integer - optional
+      ## Interval in seconds to refresh the list of databases. The default value is 600 seconds.
+      #
+      refresh_interval: 600
+```
+2. [Restart the Agent][6].
 ##### Trace collection
 Datadog APM integrates with Mongo to see the traces across your distributed system. Trace collection is enabled by default in the Datadog Agent v6+. To start collecting traces:
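To make the filtering rules from the Database Autodiscovery section of the README hunk concrete, here is a minimal Python sketch of how `include`, `exclude`, and `max_databases` could interact. It is not the integration's code: `filter_databases` is a hypothetical helper, and the matching behaviour (case-insensitive, anchored at the start of the name via `re.match`) is an assumption based on the option descriptions in `spec.yaml` from this same commit.

```python
import re
from typing import Iterable, List, Sequence


def filter_databases(
    discovered: Iterable[str],
    include: Sequence[str] = (".*",),
    exclude: Sequence[str] = (),
    max_databases: int = 100,
) -> List[str]:
    """Hypothetical helper: apply include/exclude patterns, then cap the result."""
    # Assumption: patterns are case-insensitive and anchored at the start of the name.
    include_res = [re.compile(pattern, re.IGNORECASE) for pattern in include]
    exclude_res = [re.compile(pattern, re.IGNORECASE) for pattern in exclude]

    selected: List[str] = []
    for name in discovered:
        if len(selected) >= max_databases:
            break  # stop once the configured cap is reached
        if any(regex.match(name) for regex in exclude_res):
            continue  # exclusion is checked first, so it wins over inclusion
        if any(regex.match(name) for regex in include_res):
            selected.append(name)
    return selected


if __name__ == "__main__":
    names = ["admin", "mydb", "mydb1", "mydb2", "mydb22", "orders"]
    print(filter_databases(names, include=["^mydb.*"], exclude=["^mydb2.*", "admin$"]))
```

With the README's example patterns, `mydb` and `mydb1` are kept, while `mydb2`, `mydb22`, and `admin` are dropped because exclusion is evaluated before inclusion; `orders` is dropped because it matches no include pattern.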
73 changes: 71 additions & 2 deletions mongo/assets/configuration/spec.yaml
@@ -80,15 +80,81 @@ files:
         type: object
         properties: []
     - name: dbnames
+      deprecation:
+        Agent version: "7.56.0"
+        Migration: |
+          dbnames is deprecated. Set database_autodiscovery.enabled to true to enable database autodiscovery.
+          Use database_autodiscovery.include or database_autodiscovery.exclude to include or exclude
+          specific databases to collect metrics from.
       description: |
-        Set a list of the names of all databases to collect metrics from.
-        If this key does not exist, all metrics from all databases on the server will be collected.
+        Set a list of the names of all databases to collect dbstats metrics from.
+        If this key does not exist, all dbstats metrics from all databases on the server will be collected.
       value:
         type: array
         items:
           type: string
         example:
           [ one_database, other_database ]
+    - name: database_autodiscovery
+      description: |
+        Define the configuration for database autodiscovery.
+        Complete this section if you want to auto-discover databases on this MongoDB instance.
+      options:
+        - name: enabled
+          description: Enable database autodiscovery.
+          value:
+            type: boolean
+            example: false
+            display_default: false
+        - name: max_databases
+          description: The maximum number of databases this host should monitor.
+          value:
+            type: integer
+            example: 100
+            display_default: 100
+        - name: include
+          description: |
+            Regular expression for database names to include as part of
+            database autodiscovery.
+            Will report metrics for databases that are found in this instance,
+            ignores databases listed but not found.
+            Character casing is ignored. The regular expressions start matching from
+            the beginning, so to match anywhere, prepend `.*`. For exact matches append `$`.
+            Defaults to `.*` to include everything.
+          value:
+            type: array
+            items:
+              type: string
+            example:
+              - "mydatabase$"
+              - "orders.*"
+            display_default:
+              - ".*"
+        - name: exclude
+          description: |
+            Regular expression for database names to exclude as part of `database_autodiscovery`.
+            Character casing is ignored. The regular expressions start matching from the beginning,
+            so to match anywhere, prepend `.*`. For exact matches append `$`.
+            In case of conflicts, database exclusion via `exclude` takes precedence over
+            those found via `include`
+          value:
+            type: array
+            items:
+              type: string
+            example:
+              - "admin$"
+              - "config$"
+              - "local$"
+            display_default:
+              - "admin$"
+              - "config$"
+              - "local$"
+        - name: refresh_interval
+          description: Frequency in seconds of scans for new databases. Defaults to 10 minutes.
+          value:
+            type: integer
+            example: 600
+            display_default: 600
     - name: dbm
       description: |
         Set to `true` to enable Database Monitoring.
@@ -218,6 +284,9 @@ files:
         * `db:<DB_NAME>` e.g. `db:<DB_NAME>`
         * `collection:<COLLECTION_NAME>` e.g. `collection:<COLLECTION_NAME>`
         Each collection generates many metrics, up to 8 + the number of indices on the collection for each collection.
+        NOTE: This option is ignored when database_autodiscovery is enabled.
+        Metrics are collected for all authorized collections on autodiscovered databases.
       value:
         type: array
         items:
4 changes: 4 additions & 0 deletions mongo/changelog.d/17959.added
@@ -0,0 +1,4 @@
+Support auto-discover available databases (up to 100 databases) for the monitored mongodb instance.
+By default, database autodiscovery is disabled. Set `database_autodiscovery.enabled` to true to enable database autodiscovery.
+When enabled, the integration will automatically discover the databases available in the monitored mongodb instance and refresh the list of databases every 10 minutes.
+Use `database_autodiscovery.include` and `database_autodiscovery.exclude` to filter the list of databases to monitor.
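The changelog entry says the database list is refreshed every 10 minutes. One way such a refresh could be structured is sketched below. The refresh code is not part of the hunks shown here, so `DatabaseListCache`, its `discover` callback, and the injectable `clock` are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterable, List, Optional


class DatabaseListCache:
    """Hypothetical cache that re-runs discovery only after `refresh_interval` seconds."""

    def __init__(
        self,
        discover: Callable[[], Iterable[str]],
        refresh_interval: float = 600,  # default taken from the changelog entry (10 minutes)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._discover = discover
        self._refresh_interval = refresh_interval
        self._clock = clock
        self._databases: List[str] = []
        self._last_refresh: Optional[float] = None

    def get(self) -> List[str]:
        now = self._clock()
        expired = (
            self._last_refresh is None
            or now - self._last_refresh >= self._refresh_interval
        )
        if expired:
            # Only hit the server when the cached list is missing or stale.
            self._databases = list(self._discover())
            self._last_refresh = now
        return self._databases
```

Passing the clock in as a parameter keeps the sketch testable without sleeping; a real check run would simply use the default monotonic clock.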
2 changes: 2 additions & 0 deletions mongo/changelog.d/17959.deprecated
@@ -0,0 +1,2 @@
+Configuration option `dbnames` is deprecated and will be removed in a future release.
+To monitor multiple databases, set `database_autodiscovery.enabled` to true and configure `database_autodiscovery.include` and `database_autodiscovery.exclude` filters instead.
17 changes: 17 additions & 0 deletions mongo/datadog_checks/mongo/api.py
@@ -187,6 +187,23 @@ def refresh_shards(self):
     def server_status(self):
         return self['admin'].command('serverStatus')

+    def list_authorized_collections(self, db_name):
+        try:
+            return self[db_name].list_collection_names(
+                filter={"type": "collection"},  # Only return collections, not views
+                authorizedCollections=True,
+            )
+        except OperationFailure:
+            # The user is not authorized to run listCollections on this database.
+            # This is NOT a critical error, so we log it as a warning.
+            self._log.warning(
+                "Not authorized to run 'listCollections' on db %s, "
+                "please make sure the user has read access on the database or "
+                "add the database to the `database_autodiscovery.exclude` list in the configuration file",
+                db_name,
+            )
+            return []
+
     @property
     def hostname(self):
         try:
13 changes: 8 additions & 5 deletions mongo/datadog_checks/mongo/collectors/coll_stats.py
@@ -12,9 +12,6 @@
 class CollStatsCollector(MongoCollector):
     """Collects metrics from the 'collstats' command.
     Note: Collecting those metrics requires that 'collection' is set in the 'additional_metrics' section of the config.
-    Also, the collections to be monitored have to be explicitly listed in the config as well.
-    Finally, it is currently not possible to monitor collections from multiple databases using a single check instance.
-    The check will always use the main database name defined in the configuration (or 'admin' by default).
     """

     def __init__(self, check, db_name, tags, coll_names=None):
@@ -26,11 +23,17 @@ def compatible_with(self, deployment):
         # Can only be run once per cluster.
         return deployment.is_principal()

+    def _get_collections(self, api):
+        if self.coll_names:
+            return self.coll_names
+        return api.list_authorized_collections(self.db_name)
+
     def collect(self, api):
         # Ensure that you're on the right db
         db = api[self.db_name]
-        # Loop through the collections
-        for coll_name in self.coll_names:
+        coll_names = self._get_collections(api)
+
+        for coll_name in coll_names:
             # Grab the stats from the collection
             payload = {'collection': db.command("collstats", coll_name)}
             additional_tags = ["db:%s" % self.db_name, "collection:%s" % coll_name]
10 changes: 9 additions & 1 deletion mongo/datadog_checks/mongo/collectors/index_stats.py
@@ -17,14 +17,22 @@ def compatible_with(self, deployment):
         # Can only be run once per cluster.
         return deployment.is_principal()

+    def _get_collections(self, api):
+        if self.coll_names:
+            return self.coll_names
+        return api.list_authorized_collections(self.db_name)
+
     def collect(self, api):
         db = api[self.db_name]
-        for coll_name in self.coll_names:
+        coll_names = self._get_collections(api)
+
+        for coll_name in coll_names:
             try:
                 for stats in db[coll_name].aggregate([{"$indexStats": {}}], cursor={}):
                     idx_tags = self.base_tags + [
                         "name:{0}".format(stats.get('name', 'unknown')),
                         "collection:{0}".format(coll_name),
+                        "db:{0}".format(self.db_name),
                     ]
                     val = int(stats.get('accesses', {}).get('ops', 0))
                     self.gauge('mongodb.collection.indexes.accesses.ops', val, idx_tags)
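Both collectors now resolve their collection list the same way: explicitly configured names win, otherwise the check asks the server which collections the user is authorized to see, and an authorization failure degrades to an empty list instead of an error. The sketch below reproduces that control flow without a MongoDB server. `FakeDatabase`, `NotAuthorized`, and `collections_to_monitor` are hypothetical stand-ins (the real code catches pymongo's `OperationFailure` and calls `list_collection_names` on a real database handle).

```python
import logging
from typing import List, Optional, Sequence

log = logging.getLogger("mongo_autodiscovery_sketch")


class NotAuthorized(Exception):
    """Stand-in for pymongo.errors.OperationFailure, so the sketch has no dependencies."""


class FakeDatabase:
    """Hypothetical stub that mimics only the one database method the sketch needs."""

    def __init__(self, collections: Sequence[str], authorized: bool = True) -> None:
        self._collections = list(collections)
        self._authorized = authorized

    def list_collection_names(self, **kwargs) -> List[str]:
        if not self._authorized:
            raise NotAuthorized("not authorized to run listCollections")
        return list(self._collections)


def collections_to_monitor(db: FakeDatabase, configured: Optional[Sequence[str]] = None) -> List[str]:
    # Explicitly configured collections take priority over discovery.
    if configured:
        return list(configured)
    try:
        return db.list_collection_names(
            filter={"type": "collection"},  # collections only, not views
            authorizedCollections=True,
        )
    except NotAuthorized:
        # Non-critical: warn and skip this database for the current run.
        log.warning("Not authorized to list collections; skipping this database")
        return []
```

Returning an empty list on an authorization failure means the collector's loop simply has nothing to iterate over for that database, so one unreadable database does not abort collection for the others.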
35 changes: 35 additions & 0 deletions mongo/datadog_checks/mongo/config.py
@@ -110,7 +110,11 @@ def __init__(self, instance, log):
         # MongoDB instance hostname override
         self.reported_database_hostname = instance.get('reported_database_hostname', None)

+        # MongoDB database auto-discovery, disabled by default
+        self.database_autodiscovery_config = self._get_database_autodiscovery_config(instance)
+
         # Generate tags for service checks and metrics
+        # TODO: service check and metric tags should be updated to be dynamic with auto-discovered databases
         self.service_check_tags = self._compute_service_check_tags()
         self.metric_tags = self._compute_metric_tags()

@@ -167,3 +171,34 @@ def operation_samples(self):
                 self._operation_samples_config.get('explained_operations_per_hour_per_query', 10)
             ),
         }
+
+    def _get_database_autodiscovery_config(self, instance):
+        database_autodiscovery_config = instance.get('database_autodiscovery', {"enabled": False})
+        if database_autodiscovery_config['enabled']:
+            if self.db_name != 'admin':
+                # If database_autodiscovery is enabled, the `database` parameter should not be set
+                # because we want to monitor all databases. Unless the `database` parameter is set to 'admin'.
+                self.log.warning(
+                    "The `database` parameter should not be set when `database_autodiscovery` is enabled. "
+                    "The `database` parameter will be ignored."
+                )
+            if self.coll_names:
+                self.log.warning(
+                    "The `collections` parameter should not be set when `database_autodiscovery` is enabled. "
+                    "The `collections` parameter will be ignored."
+                )
+        if self.db_names:
+            # dbnames is deprecated and will be removed in a future version
+            self.log.warning(
+                "The `dbnames` parameter is deprecated and will be removed in a future version. "
+                "To monitor more databases, enable `database_autodiscovery` and use "
+                "`database_autodiscovery.include` instead."
+            )
+            include_list = [f"{db}$" for db in self.db_names]  # Append $ to each db name for exact match
+            if not database_autodiscovery_config['enabled']:
+                # if database_autodiscovery is not enabled, we should enable it
+                database_autodiscovery_config['enabled'] = True
+            if not database_autodiscovery_config.get('include'):
+                # if database_autodiscovery is enabled but include list is not set, set the include list
+                database_autodiscovery_config['include'] = include_list
+        return database_autodiscovery_config
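The `dbnames` migration above can be read as a small pure function: if the deprecated list is set, autodiscovery is switched on and, unless the user already supplied `include`, each name becomes an exact-match pattern. The sketch below restates that logic in isolation. `merge_dbnames` is a hypothetical name, and the sketch deliberately departs from the commit in three labeled ways: it copies the mapping instead of mutating it, tolerates a missing `enabled` key, and escapes regex metacharacters with `re.escape` (the commit appends `$` to the raw name).

```python
import re
from typing import Any, Dict, List, Optional


def merge_dbnames(
    autodiscovery: Optional[Dict[str, Any]],
    db_names: Optional[List[str]],
) -> Dict[str, Any]:
    """Hypothetical restatement of the dbnames-to-include migration."""
    # Departure from the commit: work on a copy so the caller's mapping is untouched.
    config = dict(autodiscovery or {})
    # Departure from the commit: a mapping without `enabled` is treated as disabled.
    config.setdefault("enabled", False)

    if db_names:
        # The deprecated option implies autodiscovery should be on.
        config["enabled"] = True
        if not config.get("include"):
            # Departure from the commit: escape the name so characters such as "."
            # are matched literally; "$" then makes the pattern an exact match.
            config["include"] = [f"{re.escape(name)}$" for name in db_names]
    return config


if __name__ == "__main__":
    print(merge_dbnames(None, ["one_database", "other_database"]))
```

An explicit `include` list is never overwritten, which matches the commit's behaviour: `dbnames` only fills the list in when the user has not provided one.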
13 changes: 13 additions & 0 deletions mongo/datadog_checks/mongo/config_models/instance.py
@@ -42,6 +42,18 @@ class CustomQuery(BaseModel):
     tags: Optional[tuple[str, ...]] = None


+class DatabaseAutodiscovery(BaseModel):
+    model_config = ConfigDict(
+        arbitrary_types_allowed=True,
+        frozen=True,
+    )
+    enabled: Optional[bool] = None
+    exclude: Optional[tuple[str, ...]] = None
+    include: Optional[tuple[str, ...]] = None
+    max_databases: Optional[int] = None
+    refresh_interval: Optional[int] = None
+
+
 class MetricPatterns(BaseModel):
     model_config = ConfigDict(
         arbitrary_types_allowed=True,
@@ -74,6 +86,7 @@ class InstanceConfig(BaseModel):
     connection_scheme: Optional[str] = None
     custom_queries: Optional[tuple[CustomQuery, ...]] = None
     database: Optional[str] = None
+    database_autodiscovery: Optional[DatabaseAutodiscovery] = None
     database_instance_collection_interval: Optional[float] = None
     dbm: Optional[bool] = None
     dbnames: Optional[tuple[str, ...]] = None
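Every field on the generated `DatabaseAutodiscovery` model defaults to `None`, while `spec.yaml` documents concrete `display_default` values. How the check combines the two is not visible in this excerpt, so the following is only a sketch of one plausible approach. It uses a frozen standard-library dataclass as a stand-in for the pydantic model, and `DatabaseAutodiscoverySketch`, `SPEC_DEFAULTS`, and `resolve` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class DatabaseAutodiscoverySketch:
    """Stand-in for the pydantic model: same fields, all optional, immutable."""

    enabled: Optional[bool] = None
    exclude: Optional[Tuple[str, ...]] = None
    include: Optional[Tuple[str, ...]] = None
    max_databases: Optional[int] = None
    refresh_interval: Optional[int] = None


# Values copied from the `display_default` entries in spec.yaml.
SPEC_DEFAULTS: Dict[str, Any] = {
    "enabled": False,
    "exclude": ("admin$", "config$", "local$"),
    "include": (".*",),
    "max_databases": 100,
    "refresh_interval": 600,
}


def resolve(config: DatabaseAutodiscoverySketch) -> Dict[str, Any]:
    """Fill every unset (None) field with its documented default."""
    resolved: Dict[str, Any] = {}
    for name, default in SPEC_DEFAULTS.items():
        value = getattr(config, name)
        resolved[name] = default if value is None else value
    return resolved
```

Checking for `None` rather than truthiness matters here: an explicit `enabled: false` or an empty `exclude` list from the user should be kept as given, not replaced by the default.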