From 85ce20d1553c45b74f54e349fda1e760e776f633 Mon Sep 17 00:00:00 2001
From: Aolin
Date: Wed, 11 Dec 2024 15:04:12 +0800
Subject: [PATCH] pd: add patrol-region-worker-count (#19600)

---
 dynamic-config.md        |  3 ++-
 pd-configuration-file.md | 11 ++++++++++-
 pd-control.md            | 10 ++++++++--
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/dynamic-config.md b/dynamic-config.md
index 3ff3870c3eaa3..b6851f60758db 100644
--- a/dynamic-config.md
+++ b/dynamic-config.md
@@ -281,7 +281,8 @@ The following PD configuration items can be modified dynamically:
 | `cluster-version` | The cluster version |
 | `schedule.max-merge-region-size` | Controls the size limit of `Region Merge` (in MiB) |
 | `schedule.max-merge-region-keys` | Specifies the maximum numbers of the `Region Merge` keys |
-| `schedule.patrol-region-interval` | Determines the frequency at which `replicaChecker` checks the health state of a Region |
+| `schedule.patrol-region-interval` | Determines the frequency at which the checker inspects the health state of a Region |
+| `scheduler.patrol-region-worker-count` | Controls the number of concurrent operators created by the checker when inspecting the health state of a Region |
 | `schedule.split-merge-interval` | Determines the time interval of performing split and merge operations on the same Region |
 | `schedule.max-snapshot-count` | Determines the maximum number of snapshots that a single store can send or receive at the same time |
 | `schedule.max-pending-peer-count` | Determines the maximum number of pending peers in a single store |
diff --git a/pd-configuration-file.md b/pd-configuration-file.md
index 008930e2b3943..49965012c6742 100644
--- a/pd-configuration-file.md
+++ b/pd-configuration-file.md
@@ -279,9 +279,18 @@ Configuration items related to scheduling
 
 ### `patrol-region-interval`
 
-+ Controls the running frequency at which `replicaChecker` checks the health state of a Region. The smaller this value is, the faster `replicaChecker` runs. Normally, you do not need to adjust this parameter.
++ Controls the running frequency at which the checker inspects the health state of a Region. The smaller this value is, the faster the checker runs. Normally, you do not need to adjust this configuration.
 + Default value: `10ms`
 
+### `patrol-region-worker-count` New in v8.5.0
+
+> **Warning:**
+>
+> Setting this configuration item to a value greater than 1 enables concurrent checks. This is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/tikv/pd/issues) on GitHub.
+
++ Controls the number of concurrent [operators](/glossary.md#operator) created by the checker when inspecting the health state of a Region. Normally, you do not need to adjust this configuration.
++ Default value: `1`
+
 ### `split-merge-interval`
 
 + Controls the time interval between the `split` and `merge` operations on the same Region. That means a newly split Region will not be merged for a while.
diff --git a/pd-control.md b/pd-control.md
index 086846a85ab13..2f706a2020132 100644
--- a/pd-control.md
+++ b/pd-control.md
@@ -233,10 +233,16 @@ Usage:
     config set region-score-formula-version v2
     ```
 
-- `patrol-region-interval` controls the execution frequency that `replicaChecker` checks the health status of Regions. A shorter interval indicates a higher execution frequency. Generally, you do not need to adjust it.
+- `patrol-region-interval` controls the execution frequency that the checker inspects the health status of Regions. A shorter interval indicates a higher execution frequency. Generally, you do not need to adjust it.
 
     ```bash
-    config set patrol-region-interval 10ms // Set the execution frequency of replicaChecker to 10ms
+    config set patrol-region-interval 10ms // Set the execution frequency of the checker to 10ms
+    ```
+
+- `patrol-region-worker-count` controls the number of concurrent [operators](/glossary.md#operator) created by the checker when inspecting the health state of a Region. Normally, you do not need to adjust this configuration. Setting this configuration item to a value greater than 1 enables concurrent checks. Currently, this feature is experimental, and it is not recommended that you use it in the production environment.
+
+    ```bash
+    config set patrol-region-worker-count 2 // Set the checker concurrency to 2
     ```
 
 - `max-store-down-time` controls the time that PD decides the disconnected store cannot be restored if exceeded. If PD does not receive heartbeats from a store within the specified period of time, PD adds replicas in other nodes.