PBM communication Internals
`pbm-agent` nodes first make a connection to a single `mongod` process on localhost, and then automatically create connections to some other nodes in the same cluster (if the node is part of a sharded cluster). As such, pbm-agents only participate in the backups and restores of a single cluster or non-sharded replica set.
The `pbm` CLI also only works with a single cluster or non-sharded replica set at a time. It is a stateless command, though, so it can connect to different clusters from one command to the next without the operations on those clusters or non-sharded replica sets conflicting.
Definition: The replica set with the PBM control collections = the configsvr replica set in a sharded cluster, or just the replica set itself if it is a non-sharded replica set.
The `pbm` CLI connects to the replica set with the PBM control collections using a replica set connection, i.e. with a "replicaSet=XXXXX" option in it. (This means it will automatically find the current primary, and automatically switch if there is an election.)
The `pbm-agent` processes connect using a standalone connection, and this should only be to localhost. If an agent is connected to a shard node then, once that connection is established, it uses the info in the {"_id": "shardIdentity"} document of the admin.system.version collection and/or the isMaster command to automatically make a new replica set connection to the replica set with the PBM control collections.
Relevant code: pbm.go's New()
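As a rough illustration of the discovery step, the sketch below assumes the shardIdentity document carries the config server replica set in a `configsvrConnectionString` field of the form `<replSetName>/<host:port>,<host:port>`. That is how MongoDB itself records it; whether PBM parses it exactly this way is an assumption. The function name and the sample document values are made up for the example.

```python
def parse_configsvr_connection_string(conn_str):
    """Split 'rsName/host1:port,host2:port' into (rsName, [hosts])."""
    rs_name, sep, host_part = conn_str.partition("/")
    if not sep or not rs_name or not host_part:
        raise ValueError(f"not a replica set connection string: {conn_str!r}")
    hosts = [h.strip() for h in host_part.split(",") if h.strip()]
    return rs_name, hosts


# Hypothetical shardIdentity document, as it might be read from
# admin.system.version on a shard node (all values are invented).
shard_identity = {
    "_id": "shardIdentity",
    "shardName": "shard0",
    "configsvrConnectionString": "configrs/configsvrA:27019,configsvrB:27019",
}

rs_name, hosts = parse_configsvr_connection_string(
    shard_identity["configsvrConnectionString"]
)
print(rs_name, hosts)
```

On a non-sharded replica set there is no shardIdentity document, so an agent would presumably fall back to the set name and host list reported by isMaster.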
The ability to connect to the mongod nodes and make reads and writes in the PBM control collections is the only form of authentication and authorization used by PBM.
The `pbm` CLI and `pbm-agent` nodes both use a MongoDB user as their authentication. No specific user name is required, but it should be a user in the "admin" database.
N.b. The user needs to be created on every shard as well as on the configsvr replica set in a cluster. So connect to the primary in the configsvr replica set and run the createUser command there; then repeat the same thing for every shard too. (This is a requirement for DBA-use accounts in general with MongoDB; it is not specific to PBM.)
When `pbm-agent` automatically makes new connections to other parts of the topology it reuses the same username and password. E.g. from the URI `mongodb://myuser:mypass@localhost:27018/` it makes the new URI `mongodb://myuser:mypass@configsvrA:27019,.../?replicaSet=configrs`. In theory the `pbm` CLI and `pbm-agent` could connect with different users, but this is not tested. There may also be slight differences between the privileges that the `pbm` CLI uses and those that the `pbm-agent` processes use, but as of v1.0 there has been no plan to separate and reduce them. This part is being kept simple for now.
Relevant code: pbm.go's New()
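The credential reuse described above can be sketched with the standard library alone. This is only an illustration of the URI rewrite, not the code in pbm.go; `control_uri` is a hypothetical helper, and the second config server host is invented for the example.

```python
from urllib.parse import quote, urlsplit


def control_uri(agent_uri, rs_name, hosts):
    """Build a replica set URI that reuses the credentials of agent_uri."""
    parts = urlsplit(agent_uri)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URI")
    # Keep the userinfo exactly as written so any percent-encoding survives.
    userinfo, at, _local_host = parts.netloc.rpartition("@")
    credentials = userinfo + "@" if at else ""
    return "mongodb://{}{}/?replicaSet={}".format(
        credentials, ",".join(hosts), quote(rs_name, safe="")
    )


new_uri = control_uri(
    "mongodb://myuser:mypass@localhost:27018/",
    "configrs",
    ["configsvrA:27019", "configsvrB:27019"],
)
print(new_uri)
```

Copying the userinfo verbatim, rather than decoding and re-encoding it, avoids double-escaping passwords that contain reserved characters.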
The roles that PBM uses are: "readWrite" on every collection in the "admin" db, plus the built-in roles "backup", "restore" and "clusterMonitor". The restore stages need the most permissions. In theory the permissions could be reduced as long as the PBM user can still upgrade its own privileges (by having the userAdmin privilege in "admin" or the ability to "readWrite" on admin.system.users) during restores, but as of v1.0 the person installing PBM is obliged to grant the complete set of roles from the start.
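For illustration, a createUser command document granting these roles might look like the sketch below. The document shape follows MongoDB's createUser command; the user name, password and replica set names are placeholders, and the helper functions are hypothetical.

```python
# Roles named in the text, all granted on the "admin" database.
PBM_ROLES = [
    {"role": "readWrite", "db": "admin"},
    {"role": "backup", "db": "admin"},
    {"role": "restore", "db": "admin"},
    {"role": "clusterMonitor", "db": "admin"},
]


def create_user_command(username, password):
    """Return a createUser command document for a PBM user."""
    return {
        "createUser": username,
        "pwd": password,
        "roles": [dict(role) for role in PBM_ROLES],
    }


def replica_sets_needing_user(configsvr, shards):
    """The user has to exist on the configsvr replica set and on every shard."""
    return [configsvr, *shards]


cmd = create_user_command("pbmuser", "secretpwd")
targets = replica_sets_needing_user("configrs", ["shard0", "shard1"])
print(cmd["createUser"], [r["role"] for r in cmd["roles"]], targets)
```

With a driver such as PyMongo, a document of this shape could be passed to the "admin" database's `command()` method on the primary of each replica set in `targets`.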
(A cluster or non-sharded replica set that has no authorization enabled should presumably allow the pbm-agent and pbm CLI to connect and do everything they need, but this has not been tested.)
Communication between the `pbm` CLI and the `pbm-agent` processes is done via collections in the cluster or non-sharded replica set itself. The CLI starts an operation by inserting a new pbmCmd document. The agents are always watching this collection, and then respond. They in turn update other collections as they proceed.
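The pattern can be pictured with a small in-memory simulation. This is a sketch only: the classes stand in for the pbmCmd collection and the agents, the field names (`seq`, `op`, `params`) are invented, and the real agents may watch the collection by some other mechanism than the polling shown here.

```python
import itertools


class CommandCollection:
    """In-memory stand-in for admin.pbmCmd (hypothetical field names)."""

    def __init__(self):
        self._docs = []
        self._seq = itertools.count(1)

    def insert(self, op, **params):
        doc = {"seq": next(self._seq), "op": op, "params": params}
        self._docs.append(doc)
        return doc

    def newer_than(self, last_seq):
        return [d for d in self._docs if d["seq"] > last_seq]


class Agent:
    """Remembers the last command it saw and reacts only to newer ones."""

    def __init__(self, name, commands):
        self.name = name
        self.commands = commands
        self.last_seq = 0
        self.handled = []

    def poll(self):
        for doc in self.commands.newer_than(self.last_seq):
            self.handled.append(doc["op"])
            self.last_seq = doc["seq"]


commands = CommandCollection()
agents = [Agent("rs0-a", commands), Agent("rs0-b", commands)]

commands.insert("backup", storage="example-store")  # what the CLI would do
for agent in agents:
    agent.poll()

commands.insert("restore", backup_name="example-backup")
for agent in agents:
    agent.poll()

print({agent.name: agent.handled for agent in agents})
```

Because the CLI only writes a document and exits, it stays stateless; all coordination state lives in the collections.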
Collection | Purpose |
---|---|
admin.pbmConfig | Stores one document with the remote storage config as one nested document |
admin.pbmCmd | Holds objects inserted by the pbm CLI to start an operation (backup or restore) |
admin.pbmOp | Lock structure. The 'winning' agent for each replicaset that does the backup will be the one that writes itself in first. |
admin.pbmBackup | The status/log of an operation. Contains the op type, parameters (e.g. the remote storage being saved to if a backup op). Each replicaset has its own processing state in a nested object. |
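
The "first writer wins" behaviour of the lock collection can be sketched as an insert-if-absent operation keyed by replica set. The class below is an in-memory stand-in for admin.pbmOp with an invented structure; how PBM actually enforces uniqueness in MongoDB is not shown here, and the agent names are placeholders.

```python
class OpLocks:
    """In-memory stand-in for admin.pbmOp: one lock holder per replica set."""

    def __init__(self):
        self._holders = {}

    def try_acquire(self, replset, agent):
        # setdefault records the agent only if no holder exists yet and
        # always returns the recorded holder, so only the first caller
        # (or the same agent asking again) gets True.
        return self._holders.setdefault(replset, agent) == agent

    def holder(self, replset):
        return self._holders.get(replset)


locks = OpLocks()
candidates = ["rs0-a:27018", "rs0-b:27018", "rs0-c:27018"]
results = {agent: locks.try_acquire("rs0", agent) for agent in candidates}
print(results, locks.holder("rs0"))
```

Each replica set has its own key, so an agent on another replica set can take that set's lock independently.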