From a2c04501067dd3cb0e280fa62797a4f430d0fcc5 Mon Sep 17 00:00:00 2001
From: tonyxuqqi
Date: Wed, 24 Jan 2024 17:09:36 -0800
Subject: [PATCH] titan doc update for release 7.6.0 (#15986)

* titan doc update for release 7.6.0

Signed-off-by: tonyxuqqi

* lint issue

Signed-off-by: Qi Xu

* Apply suggestions from code review

* Apply suggestions from code review

* Update tikv-configuration-file.md

* Apply suggestions from code review

* change the default value of blob-file-compression to zstd

* Update tikv-configuration-file.md

* Update tikv-configuration-file.md

* Apply suggestions from code review

Co-authored-by: Ran

* polish titan doc

Signed-off-by: tonyxuqqi

* address comments

Signed-off-by: tonyxuqqi

* update gc thread count

Signed-off-by: tonyxuqqi

* update num-threads

Signed-off-by: tonyxuqqi

* titan: update titan doc for v7.6.0 (enable titan by default)

* synced cn changes

* Update tikv-configuration-file.md

* Update titan-configuration.md

* Update titan-configuration.md

* Update storage-engine/titan-overview.md

* Apply suggestions from code review

* Update storage-engine/titan-configuration.md

* Update storage-engine/titan-configuration.md

* add min blob size link

* Apply suggestions from code review

Co-authored-by: Aolin

* Update tikv-configuration-file.md

Co-authored-by: Aolin

* Apply suggestions from code review

Co-authored-by: Aolin

* Update storage-engine/titan-configuration.md

Co-authored-by: Aolin

* Update storage-engine/titan-configuration.md

Co-authored-by: Aolin

* Update tikv-configuration-file.md

* Update tikv-configuration-file.md

---------

Signed-off-by: tonyxuqqi
Signed-off-by: Qi Xu
Co-authored-by: Qi Xu
Co-authored-by: xixirangrang
Co-authored-by: Ran
Co-authored-by: benmaoer <24819510+benmaoer@users.noreply.github.com>
Co-authored-by: xixirangrang <35301108+hfxsd@users.noreply.github.com>
Co-authored-by: Aolin
---
 storage-engine/titan-configuration.md | 147 +++++++++++++++-----------
 storage-engine/titan-overview.md      |   6 ++
 tikv-configuration-file.md            |  64 ++++++++---
 3 files changed, 141 insertions(+), 76 deletions(-)

diff --git a/storage-engine/titan-configuration.md b/storage-engine/titan-configuration.md
index b5bb12431c3cd..8b992ae38c840 100644
--- a/storage-engine/titan-configuration.md
+++ b/storage-engine/titan-configuration.md
@@ -5,25 +5,27 @@ summary: Learn how to configure Titan.
 
 # Titan Configuration
 
-This document introduces how to enable and disable [Titan](/storage-engine/titan-overview.md) using the corresponding configuration items, as well as the relevant parameters and the Level Merge feature.
+This document introduces how to enable and disable [Titan](/storage-engine/titan-overview.md) using the corresponding configuration items, as well as the data conversion mechanism, the relevant parameters, and the Level Merge feature.
 
 ## Enable Titan
 
-Titan is compatible with RocksDB, so you can directly enable Titan on the existing TiKV instances that use RocksDB. You can use one of the following two methods to enable Titan:
+> **Note:**
+>
+> - Starting from TiDB v7.6.0, Titan is enabled by default for new clusters to enhance the performance of writing wide tables and JSON data. The default value of the [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) threshold is changed from `1KB` to `32KB`.
+> - Existing clusters upgraded to v7.6.0 or later versions retain the original configuration, which means that if Titan is not explicitly enabled, the cluster still uses RocksDB.
+> - If you have enabled Titan before upgrading a cluster to TiDB v7.6.0 or later versions, Titan remains enabled after the upgrade, and the configuration of [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) before the upgrade is also retained. If you do not explicitly configure the value before the upgrade, the default value of the old version (`1KB`) is retained to ensure the stability of the cluster configuration after the upgrade.
 
-+ Method 1: If you have deployed the cluster using TiUP, you can execute the `tiup cluster edit-config ${cluster-name}` command and edit the TiKV configuration file as the following example shows:
+Titan is compatible with RocksDB, so you can directly enable Titan on the existing TiKV instances that use RocksDB. You can use one of the following methods to enable Titan:
 
-    {{< copyable "shell-regular" >}}
++ Method 1: If you have deployed the cluster using TiUP, you can execute the `tiup cluster edit-config ${cluster-name}` command and edit the TiKV configuration file as the following example shows:
 
    ```shell
-    tikv:
-        rocksdb.titan.enabled: true
+    tikv:
+      rocksdb.titan.enabled: true
    ```
 
    Reload the configuration and TiKV will be rolling restarted dynamically:
 
-    {{< copyable "shell-regular" >}}
-
    ```shell
    tiup cluster reload ${cluster-name} -R tikv
    ```
 
@@ -32,83 +34,98 @@ Titan is compatible with RocksDB, so you can directly enable Titan on the existi
 
 + Method 2: Directly edit the TiKV configuration file to enable Titan (**NOT** recommended for the production environment).
 
-    {{< copyable "" >}}
-
-    ``` toml
+    ```toml
    [rocksdb.titan]
    enabled = true
    ```
 
-After Titan is enabled, the existing data stored in RocksDB is not immediately moved to the Titan engine. As new data is written to the TiKV foreground and RocksDB performs compaction, the values are progressively separated from keys and written to Titan. You can view the **TiKV Details** -> **Titan kv** -> **blob file size** panel to confirm the size of the data stored in Titan.
++ Method 3: Edit the `${cluster_name}/tidb-cluster.yaml` configuration file for TiDB Operator:
 
-If you want to speed up the writing process, compact data of the whole TiKV cluster manually using tikv-ctl. For details, see [manual compaction](/tikv-control.md#compact-data-of-the-whole-tikv-cluster-manually).
+    ```yaml
+    spec:
+      tikv:
+        ## Base image of the component
+        baseImage: pingcap/tikv
+        ## tikv-server configuration
+        ## Ref: https://docs.pingcap.com/tidb/stable/tikv-configuration-file
+        config: |
+          log-level = "info"
+          [rocksdb]
+          [rocksdb.titan]
+          enabled = true
+    ```
 
-> **Note:**
+    Apply the configuration to trigger an online rolling restart of the TiDB cluster for the changes to take effect:
+
+    ```shell
+    kubectl apply -f ${cluster_name} -n ${namespace}
+    ```
+
+    For more information, refer to [Configuring a TiDB Cluster in Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster).
+
+## Data Conversion
+
+> **Warning:**
 >
-> When Titan is disabled, RocksDB cannot read data that has been migrated to Titan. If Titan is incorrectly disabled on a TiKV instance with Titan already enabled (mistakenly set `rocksdb.titan.enabled` to `false`), TiKV will fail to start, and the `You have disabled titan when its data directory is not empty` error appears in the TiKV log. To correctly disabled Titan, see [Disable Titan](#disable-titan).
+> When Titan is disabled, RocksDB cannot read data that has been moved to Titan. If Titan is incorrectly disabled on a TiKV instance with Titan already enabled (mistakenly setting `rocksdb.titan.enabled` to `false`), TiKV will fail to start, and the `You have disabled titan when its data directory is not empty` error appears in the TiKV log. To correctly disable Titan, see [Disable Titan](#disable-titan).
 
-## Parameters
+After Titan is enabled, the existing data stored in RocksDB is not immediately moved to the Titan engine. As new data is written to TiKV and RocksDB performs compaction, **the values are progressively separated from keys and written to Titan**. Similarly, the data restored through BR snapshot/log backups, converted during scaling, or imported by TiDB Lightning Physical Import Mode is not written directly into Titan. As compaction proceeds, values exceeding the [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) threshold (`32KB` by default) in the processed SST files are separated into Titan. You can monitor the size of files stored in Titan by observing the **TiKV Details** > **Titan kv** > **blob file size** panel to estimate the data size.
 
-To adjust Titan-related parameters using TiUP, refer to [Modify the configuration](/maintain-tidb-using-tiup.md#modify-the-configuration).
+If you want to speed up the writing process, you can use tikv-ctl to compact data of the whole TiKV cluster manually. For details, see [manual compaction](/tikv-control.md#compact-data-of-the-whole-tikv-cluster-manually). Data access remains continuous during the conversion from RocksDB to Titan, so the RocksDB block cache significantly accelerates the conversion process. In a test, 670 GiB of TiKV data was converted to Titan within one hour using tikv-ctl.
 
-+ Titan GC thread count.
+Note that the values in Titan Blob files are not continuous, and Titan's cache works at the value level, so the Blob Cache does not help during compaction. The conversion speed from Titan back to RocksDB is an order of magnitude slower than that from RocksDB to Titan. In a test, it took 12 hours to convert 800 GiB of Titan data on a TiKV node back to RocksDB through a full compaction using tikv-ctl.
 
-  From the **TiKV Details** -> **Thread CPU** -> **RocksDB CPU** panel, if you observe that the Titan GC threads are at full capacity for a long time, consider increasing the size of the Titan GC thread pool.
+## Parameters
 
-  {{< copyable "" >}}
+By properly configuring Titan parameters, you can effectively improve database performance and resource utilization. This section introduces some key parameters you can use.
 
-  ```toml
-  [rocksdb.titan]
-  max-background-gc = 1
-  ```
+### `min-blob-size`
 
-+ Value size threshold.
+You can use [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) to set the value size threshold that determines which data is stored in RocksDB and which in Titan's blob files. According to tests, `32KB` is an appropriate threshold that delivers better write throughput than RocksDB without scan throughput regression. If you want to further improve write performance and can accept some scan performance regression, you can change the value to `1KB`.
 
-  When the size of the value written to the foreground is smaller than the threshold, this value is stored in RocksDB; otherwise, this value is stored in the blob file of Titan. Based on the distribution of value sizes, if you increase the threshold, more values are stored in RocksDB and TiKV performs better in reading small values. If you decrease the threshold, more values go to Titan, which further reduces RocksDB compactions.
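+The following is a minimal sketch of lowering this threshold for a write-heavy, point-query workload (the `1KB` value is only illustrative and matches the pre-v7.6.0 default; the threshold takes effect as data is compacted rather than retroactively):
+
+```toml
+[rocksdb.defaultcf.titan]
+# Values no smaller than this threshold are separated into Titan blob files
+# during flush and compaction. A smaller threshold moves more values to Titan
+# and further reduces RocksDB compaction, at the cost of range scan performance.
+min-blob-size = "1KB"
+```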
 
-  ```toml
-  [rocksdb.defaultcf.titan]
-  min-blob-size = "1KB"
-  ```
+### `blob-file-compression` and `zstd-dict-size`
 
-+ The algorithm used for compressing values in Titan, which takes value as the unit.
+You can use [`blob-file-compression`](/tikv-configuration-file.md#blob-file-compression) to specify the compression algorithm used for values in Titan. You can also enable the `zstd` dictionary compression through [`zstd-dict-size`](/tikv-configuration-file.md#zstd-dict-size) to improve the compression rate.
 
-  ```toml
-  [rocksdb.defaultcf.titan]
-  blob-file-compression = "lz4"
-  ```
+### `blob-cache-size`
 
-+ The size of value caches in Titan.
+You can use [`blob-cache-size`](/tikv-configuration-file.md#blob-cache-size) to control the cache size of values in Titan. A larger cache size means higher read performance of Titan. However, too large a cache size causes Out of Memory (OOM) issues.
 
-  Larger cache size means higher read performance of Titan. However, too large a cache size causes Out of Memory (OOM). It is recommended to set the value of `storage.block-cache.capacity` to the store size minus the blob file size and set `blob-cache-size` to `memory size * 50% - block cache size` according to the monitoring metrics when the database is running stably. This maximizes the blob cache size when the block cache is large enough for the whole RocksDB engine.
+It is recommended to set the value of `storage.block-cache.capacity` to the store size minus the blob file size, and set `blob-cache-size` to `memory size * 50% - block cache size` according to the monitoring metrics when the database is running stably. This maximizes the blob cache size when the block cache is large enough for the whole RocksDB engine. For example, on a TiKV node with 32 GiB of memory and a 10 GiB block cache, you can set `blob-cache-size` to `32 GiB * 50% - 10 GiB = 6 GiB`.
 
-  ```toml
-  [rocksdb.defaultcf.titan]
-  blob-cache-size = 0
-  ```
+### `discardable-ratio` and `max-background-gc`
 
-+ When the ratio of discardable data (the corresponding key has been updated or deleted) in a blob file exceeds the following threshold, Titan GC is triggered.
+The [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio) and [`max-background-gc`](/tikv-configuration-file.md#max-background-gc) parameters significantly impact Titan's read performance and garbage collection process.
 
+When the ratio of obsolete data (the corresponding key has been updated or deleted) in a blob file exceeds the threshold set by [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio), Titan GC is triggered. Reducing this threshold reduces space amplification but causes more frequent Titan GC. Increasing it reduces the frequency of Titan GC and the corresponding I/O bandwidth and CPU consumption, but increases disk space usage.
 
-  ```toml
-  discardable-ratio = 0.5
-  ```
+If you observe from the **TiKV Details** > **Thread CPU** > **RocksDB CPU** panel that the Titan GC threads are at full load for a long time, consider adjusting [`max-background-gc`](/tikv-configuration-file.md#max-background-gc) to increase the Titan GC thread pool size.
 
-  When Titan writes the useful data of this blob file to another file, you can use the `discardable-ratio` value to estimate the upper limits of write amplification and space amplification (assuming the compression is disabled).
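+The following is a minimal tuning sketch for these two parameters (the thread count `8` is only illustrative; the defaults are `4` for `max-background-gc` and `0.5` for `discardable-ratio`):
+
+```toml
+[rocksdb.titan]
+# Allow more Titan GC threads when the existing ones are saturated.
+max-background-gc = 8
+
+[rocksdb.defaultcf.titan]
+# Trigger GC for a blob file once half of its data is obsolete.
+discardable-ratio = 0.5
+```
+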
+### `rate-bytes-per-sec`
 
-  Upper limit of write amplification = 1 / discardable_ratio
+You can adjust [`rate-bytes-per-sec`](/tikv-configuration-file.md#rate-bytes-per-sec) to limit the I/O rate of RocksDB compaction, reducing its impact on foreground read and write performance during high traffic.
 
-  Upper limit of space amplification = 1 / (1 - discardable_ratio)
+### Titan configuration example
 
-  From the two equations above, you can see that decreasing the value of `discardable_ratio` can reduce space amplification but causes GC to be more frequent in Titan. Increasing the value reduces Titan GC, the corresponding I/O bandwidth, and CPU consumption but increases disk usage.
+The following is an example of the Titan configuration file. You can either [use TiUP to modify the configuration](/maintain-tidb-using-tiup.md#modify-the-configuration) or [configure a TiDB cluster on Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster).
 
-+ The following option limits the I/O rate of RocksDB compaction. During peak traffic, limiting RocksDB compaction, its I/O bandwidth, and its CPU consumption reduces its impact on the write and read performance of the foreground.
+```toml
+[rocksdb]
+rate-bytes-per-sec = 0
 
-    When Titan is enabled, this option limits the summed I/O rates of RocksDB compaction and Titan GC. If you find that the I/O and/or CPU consumption of RocksDB compaction and Titan GC is too large, set this option to a suitable value according the disk I/O bandwidth and the actual write traffic.
+[rocksdb.titan]
+enabled = true
+max-background-gc = 1
 
-  ```toml
-  [rocksdb]
-  rate-bytes-per-sec = 0
-  ```
+[rocksdb.defaultcf.titan]
+min-blob-size = "32KB"
+blob-file-compression = "zstd"
+zstd-dict-size = "16KB"
+blob-cache-size = "0GB"
+discardable-ratio = 0.5
+blob-run-mode = "normal"
+level-merge = false
+```
 
 ## Disable Titan
 
@@ -118,26 +135,30 @@ To disable Titan, you can configure the `rocksdb.defaultcf.titan.blob-run-mode`
 
 - When the option is set to `read-only`, all newly written values are written into RocksDB, regardless of the value size.
 - When the option is set to `fallback`, all newly written values are written into RocksDB, regardless of the value size. Also, all compacted values stored in the Titan blob file are automatically moved back to RocksDB.
 
-To fully disable Titan for all existing and future data, you can follow these steps:
+To fully disable Titan for all existing and future data, you can follow these steps. Note that you can skip Step 2 because it greatly impacts online traffic performance. In fact, even without Step 2, data compaction consumes extra I/O and CPU resources when it moves data from Titan back to RocksDB, and performance can degrade by as much as 50% when TiKV I/O or CPU resources are limited.
 
 1. Update the configuration of the TiKV nodes you wish to disable Titan for. You can update configuration in two methods:
 
    + Execute `tiup cluster edit-config`, edit the configuration file, and execute `tiup cluster reload -R tikv`.
   + Manually update the configuration file and restart TiKV.
 
-     ```toml
-     [rocksdb.defaultcf.titan]
-     blob-run-mode = "fallback"
-     discardable-ratio = 1.0
-     ```
+      ```toml
+      [rocksdb.defaultcf.titan]
+      blob-run-mode = "fallback"
+      discardable-ratio = 1.0
+      ```
+
+    > **Note:**
+    >
+    > When there is insufficient disk space to accommodate both Titan and RocksDB data, it is recommended to use the default value of `0.5` for [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio). In general, the default value is recommended when available disk space is less than 50%. This is because when `discardable-ratio = 1.0`, the RocksDB data continues to increase. At the same time, recycling an existing blob file in Titan requires all the data in that file to be converted to RocksDB first, which is a slow process. However, if the disk size is large enough, setting `discardable-ratio = 1.0` can reduce the GC of the blob file itself during compaction, which saves bandwidth.
 
-2. Perform a full compaction using tikv-ctl. This process will consume large amount of I/O and CPU resources.
+2. (Optional) Perform a full compaction using tikv-ctl. This process will consume a large amount of I/O and CPU resources.
 
    ```bash
    tikv-ctl --pd <pd-address> compact-cluster --bottommost force
    ```
 
-3. After the compaction is finished, you should wait for the **Blob file count** metrics under **TiKV-Details**/**Titan - kv** to decrease to `0`.
+3. After the compaction is finished, wait for the **Blob file count** metric under **TiKV-Details**/**Titan - kv** to decrease to `0`.
 
 4. Update the configuration of these TiKV nodes to disable Titan.
 
diff --git a/storage-engine/titan-overview.md b/storage-engine/titan-overview.md
index ae7adb16d208a..33ad6cff201cd 100644
--- a/storage-engine/titan-overview.md
+++ b/storage-engine/titan-overview.md
@@ -31,6 +31,8 @@ The prerequisites for enabling Titan are as follows:
 - No range query will be performed or you do not need a high performance of range query. Because the data stored in Titan is not well-ordered, its performance of range query is poorer than that of RocksDB, especially for the query of a large range. According to PingCAP's internal test, Titan's range query performance is 40% to a few times lower than that of RocksDB.
 - Sufficient disk space (consider reserving a space twice of the RocksDB disk consumption with the same data volume). This is because Titan reduces write amplification at the cost of disk space. In addition, Titan compresses values one by one, and its compression rate is lower than that of RocksDB. RocksDB compresses blocks one by one. Therefore, Titan consumes more storage space than RocksDB, which is expected and normal. In some situations, Titan's storage consumption can be twice that of RocksDB.
 
+Starting from v7.6.0, Titan is enabled by default for newly created clusters. Because small TiKV values remain stored in RocksDB, you can enable Titan in these scenarios as well.
+
 If you want to improve the performance of Titan, see the blog post [Titan: A RocksDB Plugin to Reduce Write Amplification](https://pingcap.com/blog/titan-storage-engine-design-and-implementation/).
 
 ## Architecture and implementation
@@ -124,3 +126,7 @@ Range Merge is an optimized approach of GC based on Level Merge. However, the bo
 ![RangeMerge](/media/titan/titan-7.png)
 
 Therefore, the Range Merge operation is needed to keep the number of sorted runs within a certain level. At the time of OnCompactionComplete, Titan counts the number of sorted runs in a range. If the number is large, Titan marks the corresponding blob file as ToMerge and rewrites it in the next compaction.
+
+### Scale out and scale in
+
+For backward compatibility, TiKV snapshots generated during scaling remain in the RocksDB format. Because the data on newly scaled nodes is all in the RocksDB format at the beginning, these nodes carry the characteristics of RocksDB, such as a higher compression rate than the old TiKV nodes, a smaller store size, and relatively larger write amplification during compaction. These SST files in RocksDB format are gradually converted to the Titan format as compaction proceeds.
diff --git a/tikv-configuration-file.md b/tikv-configuration-file.md
index f7333c1b1fffe..52e81c26cc91d 100644
--- a/tikv-configuration-file.md
+++ b/tikv-configuration-file.md
@@ -1037,7 +1037,9 @@ Configuration items related to Raftstore.
 >
 > Periodic full compaction is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.
 
-+ Sets the specific times that TiKV initiates periodic full compaction. You can specify multiple time schedules in an array. For example, `periodic-full-compact-start-times = ["03:00", "23:00"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM, based on the local time zone of the TiKV node. `periodic-full-compact-start-times = ["03:00 +0000", "23:00 +0000"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM in UTC time.
++ Sets the specific times that TiKV initiates periodic full compaction. You can specify multiple time schedules in an array. For example:
+    + `periodic-full-compact-start-times = ["03:00", "23:00"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM, based on the local time zone of the TiKV node.
+    + `periodic-full-compact-start-times = ["03:00 +0000", "23:00 +0000"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM in UTC time.
 + Default value: `[]`, which means periodic full compaction is disabled by default.
 
 ### `periodic-full-compact-start-max-cpu` New in v7.6.0
 
@@ -1228,7 +1230,7 @@ Configuration items related to RocksDB
 
 ### `rate-bytes-per-sec`
 
-+ The maximum rate permitted by RocksDB's compaction rate limiter
++ When Titan is disabled, this configuration item limits the I/O rate of RocksDB compaction to reduce the impact of RocksDB compaction on the foreground read and write performance during traffic peaks. When Titan is enabled, this configuration item limits the summed I/O rates of RocksDB compaction and Titan GC. If you find that the I/O or CPU consumption of RocksDB compaction and Titan GC is too large, set this configuration item to an appropriate value according to the disk I/O bandwidth and the actual write traffic.
 + Default value: `10GB`
 + Minimum value: `0`
 + Unit: B|KB|MB|GB
 
@@ -1329,8 +1331,14 @@ Configuration items related to Titan.
 
 ### `enabled`
 
-+ Enables or disables Titan
-+ Default value: `false`
+> **Note:**
+>
+> - To enhance the performance of writing and point queries for wide tables and JSON data, starting from TiDB v7.6.0, the default value changes from `false` to `true`, which means that Titan is enabled by default.
+> - Existing clusters upgraded to v7.6.0 or later versions retain the original configuration, which means that if Titan is not explicitly enabled, the cluster still uses RocksDB.
+> - If the cluster has enabled Titan before upgrading to TiDB v7.6.0 or later versions, Titan remains enabled after the upgrade, and the [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) configuration before the upgrade is retained. If you do not explicitly configure the value before the upgrade, the default value of the previous version (`1KB`) is retained to ensure the stability of the cluster configuration after the upgrade.
+
++ Enables or disables Titan.
++ Default value: `true`
 
 ### `dirname`
 
@@ -1344,7 +1352,7 @@ Configuration items related to Titan.
 
 ### `max-background-gc`
 
-+ The maximum number of GC threads in Titan
++ The maximum number of GC threads in Titan. From the **TiKV Details** > **Thread CPU** > **RocksDB CPU** panel, if you observe that the Titan GC threads are at full capacity for a long time, consider increasing the size of the Titan GC thread pool.
 + Default value: `4`
 + Minimum value: `1`
 
@@ -1609,30 +1617,48 @@ Configuration items related to `rocksdb.defaultcf`, `rocksdb.writecf`, and `rock
 
 ## rocksdb.defaultcf.titan
 
+> **Note:**
+>
+> Titan can only be enabled in `rocksdb.defaultcf`. It is not supported to enable Titan in `rocksdb.writecf`.
+
 Configuration items related to `rocksdb.defaultcf.titan`.
 
 ### `min-blob-size`
 
+> **Note:**
+>
+> - Starting from TiDB v7.6.0, Titan is enabled by default to enhance the performance of writing and point queries for wide tables and JSON data. The default value of `min-blob-size` changes from `1KB` to `32KB`. This means that values exceeding `32KB` are stored in Titan, while other data continues to be stored in RocksDB.
+> - To ensure configuration consistency, for existing clusters upgrading to TiDB v7.6.0 or later versions, if you do not explicitly set `min-blob-size` before the upgrade, TiDB retains the previous default value of `1KB`.
+> - A value smaller than `32KB` might affect the performance of range scans. However, if the workload primarily involves heavy writes and point queries, you can consider decreasing the value of `min-blob-size` for better performance.
+
 + The smallest value stored in a Blob file. Values smaller than the specified size are stored in the LSM-Tree.
-+ Default value: `"1KB"`
++ Default value: `"32KB"`
 + Minimum value: `0`
 + Unit: KB|MB|GB
 
 ### `blob-file-compression`
 
+> **Note:**
+>
+> - Snappy compressed files must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported.
+> - Starting from TiDB v7.6.0, the default value of `blob-file-compression` changes from `"lz4"` to `"zstd"`.
+
 + The compression algorithm used in a Blob file
 + Optional values: `"no"`, `"snappy"`, `"zlib"`, `"bzip2"`, `"lz4"`, `"lz4hc"`, `"zstd"`
-+ Default value: `"lz4"`
++ Default value: `"zstd"`
 
-> **Note:**
->
-> The Snappy compressed file must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported.
+### `zstd-dict-size`
+
++ The zstd dictionary compression size. The default value is `"0KB"`, which means that the zstd dictionary compression is disabled. In this case, Titan compresses data based on single values, whereas RocksDB compresses data based on blocks (`32KB` by default). When the average size of Titan values is less than `32KB`, Titan's compression ratio is lower than that of RocksDB. Taking JSON as an example, the store size in Titan can be 30% to 50% larger than that of RocksDB. The actual compression ratio depends on whether the value content is suitable for compression and on the similarity among different values. You can enable the zstd dictionary compression to increase the compression ratio by configuring `zstd-dict-size` (for example, setting it to `16KB`). The actual store size can then be lower than that of RocksDB. However, the zstd dictionary compression might lead to about 10% performance regression in specific workloads.
++ Default value: `"0KB"`
++ Unit: KB|MB|GB
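+
+For example, the following minimal snippet enables the zstd dictionary compression (the `16KB` size follows the Titan configuration example in this patch; adjust it for your own workload):
+
+```toml
+[rocksdb.defaultcf.titan]
+# Dictionary compression only takes effect when values are compressed with zstd.
+blob-file-compression = "zstd"
+# Build a zstd dictionary to improve the compression ratio of small,
+# similar values, such as JSON documents.
+zstd-dict-size = "16KB"
+```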
 
 ### `blob-cache-size`
 
 + The cache size of a Blob file
 + Default value: `"0GB"`
 + Minimum value: `0`
++ Recommended value: After the database runs stably, it is recommended to set the RocksDB block cache (`storage.block-cache.capacity`) based on monitoring metrics so that the block cache hit rate stays at 95% or higher, and to set `blob-cache-size` to `(total memory size) * 50% - (size of block cache)`. This ensures that the block cache is large enough to cache the entire RocksDB engine while maximizing the blob cache size. However, to prevent a significant drop in the block cache hit rate, do not set the blob cache size too large.
 + Unit: KB|MB|GB
 
 ### `min-gc-batch-size`
 
@@ -1651,7 +1677,14 @@ Configuration items related to `rocksdb.defaultcf.titan`.
 
 ### `discardable-ratio`
 
-+ The ratio at which GC is triggered for Blob files. The Blob file can be selected for GC only if the proportion of the invalid values in a Blob file exceeds this ratio.
++ When the ratio of obsolete data (the corresponding key has been updated or deleted) in a Blob file exceeds this threshold, Titan GC is triggered. When Titan writes the valid data of this Blob file to another file, you can use the `discardable-ratio` value to estimate the upper limits of write amplification and space amplification (assuming the compression is disabled).
+
+    Upper limit of write amplification = 1 / `discardable-ratio`
+
+    Upper limit of space amplification = 1 / (1 - `discardable-ratio`)
+
+    From these two equations, you can see that decreasing the value of `discardable-ratio` can reduce space amplification but results in more frequent GC in Titan. Increasing the value reduces the frequency of Titan GC, thereby lowering the corresponding I/O bandwidth and CPU usage, but increases disk usage. For example, with the default value `0.5`, the upper limit of write amplification is `1 / 0.5 = 2` and the upper limit of space amplification is `1 / (1 - 0.5) = 2`.
+
 + Default value: `0.5`
 + Minimum value: `0`
 + Maximum value: `1`
 
@@ -1674,8 +1707,8 @@ Configuration items related to `rocksdb.defaultcf.titan`.
 
 + Specifies the running mode of Titan.
 + Optional values:
-    + `normal`: Writes data to the blob file when the value size exceeds `min-blob-size`.
-    + `read_only`: Refuses to write new data to the blob file, but still reads the original data from the blob file.
+    + `normal`: Writes data to the blob file when the value size exceeds [`min-blob-size`](#min-blob-size).
+    + `read-only`: Refuses to write new data to the blob file, but still reads the original data from the blob file.
     + `fallback`: Writes data in the blob file back to LSM.
 + Default value: `normal`
 
@@ -2033,6 +2066,11 @@ Configuration items related to TiDB Lightning import and BR restore.
 
 + The garbage ratio threshold to trigger GC.
 + Default value: `1.1`
 
+### `num-threads` New in v7.6.0
+
++ The number of GC threads when `enable-compaction-filter` is `false`.
++ Default value: `1`
+
 ## backup
 
 Configuration items related to BR backup.