Commit

Created dedicated section for chunks storage in the doc (cortexproject#3407)

* Created dedicated section to chunks storage in the doc

Signed-off-by: Marco Pracucci <[email protected]>

* Fixed white noise

Signed-off-by: Marco Pracucci <[email protected]>

* Addressed review comments

Signed-off-by: Marco Pracucci <[email protected]>

* Fixed links

Signed-off-by: Marco Pracucci <[email protected]>

* Fixed TestPurger_Restarts flakyness

Signed-off-by: Marco Pracucci <[email protected]>
pracucci authored Oct 29, 2020
1 parent 98945ad commit 67648aa
Showing 42 changed files with 329 additions and 220 deletions.
6 changes: 3 additions & 3 deletions docs/_index.md
@@ -28,13 +28,13 @@ Cortex is primarily used as a [remote write](https://prometheus.io/docs/operatin

## Documentation

Read the [getting started guide](getting-started/) if you're new to the
Read the [getting started guide](getting-started/_index.md) if you're new to the
project. Before deploying Cortex with a permanent storage backend you
should read:

1. [An overview of Cortex's architecture](architecture.md)
1. [A guide to running Cortex](production/running.md)
1. [Information regarding configuring Cortex](configuration/arguments.md)
1. [A guide to running Cortex chunks storage](guides/running-chunks-storage-in-production.md)
1. [Information regarding configuring Cortex](configuration/_index.md)

For a guide to contributing to Cortex, see the [contributor guidelines](contributing/).

4 changes: 2 additions & 2 deletions docs/api/_index.md
@@ -1,7 +1,7 @@
---
title: "HTTP API"
linkTitle: "HTTP API"
weight: 5
weight: 7
slug: api
menu:
no_section_index_title: true
@@ -83,7 +83,7 @@ When multi-tenancy is enabled, endpoints requiring authentication are expected t

Multi-tenancy can be enabled/disabled via the CLI flag `-auth.enabled` or its respective YAML config option.

_For more information, please refer to the dedicated [Authentication and Authorisation](../production/auth.md) guide._
_For more information, please refer to the dedicated [Authentication and Authorisation](../guides/authentication-and-authorisation.md) guide._

## All services

10 changes: 3 additions & 7 deletions docs/architecture.md
@@ -44,11 +44,7 @@ For this reason, the chunks storage consists of:
* [Google Cloud Storage](https://cloud.google.com/storage/)
* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)

Internally, the access to the chunks storage relies on a unified interface called "chunks store". Unlike other Cortex components, the chunk store is not a separate service, but rather a library embedded in the services that need to access the long-term storage: [ingester](#ingester), [querier](#querier) and [ruler](#ruler).

The chunk and index format are versioned, this allows Cortex operators to upgrade the cluster to take advantage of new features and improvements. This strategy enables changes in the storage format without requiring any downtime or complex procedures to rewrite the stored data. A set of schemas are used to map the version while reading and writing time series belonging to a specific period of time.

The current schema recommendation is the **v9 schema** for most use cases and **v10 schema** if you expect to have very high cardinality metrics (v11 is still experimental). For more information about the schema, please check out the [Schema](configuration/schema-config-reference.md) documentation.
For more information, please check out the [Chunks storage](./chunks-storage/_index.md) documentation.

### Blocks storage

@@ -61,8 +57,8 @@ The blocks storage doesn't require a dedicated storage backend for the index. Th
* [Amazon S3](https://aws.amazon.com/s3)
* [Google Cloud Storage](https://cloud.google.com/storage/)
* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
* [Local Filesystem](https://thanos.io/storage.md/#filesystem) (single node only)
* [OpenStack Swift](https://wiki.openstack.org/wiki/Swift) (experimental)
* [Local Filesystem](https://thanos.io/storage.md/#filesystem) (single node only)

For more information, please check out the [Blocks storage](./blocks-storage/_index.md) documentation.

@@ -110,7 +106,7 @@ The supported KV stores for the HA tracker are:

Note: Memberlist is not supported. The memberlist-based KV store propagates updates using gossip, which is very slow for HA purposes: the result is that different distributors may see a different Prometheus server as the elected HA replica, which is definitely not desirable.

For more information, please refer to [config for sending HA pairs data to Cortex](production/ha-pair-handling.md) in the documentation.
For more information, please refer to [config for sending HA pairs data to Cortex](guides/ha-pair-handling.md) in the documentation.

#### Hashing

7 changes: 4 additions & 3 deletions docs/blocks-storage/_index.md
@@ -1,7 +1,7 @@
---
title: "Blocks Storage"
linkTitle: "Blocks Storage"
weight: 8
weight: 3
menu:
---

@@ -12,6 +12,7 @@ The supported backends for the blocks storage are:
* [Amazon S3](https://aws.amazon.com/s3)
* [Google Cloud Storage](https://cloud.google.com/storage/)
* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
* [OpenStack Swift](https://wiki.openstack.org/wiki/Swift) (experimental)
* [Local Filesystem](https://thanos.io/storage.md/#filesystem) (single node only)

_Internally, some components are based on [Thanos](https://thanos.io), but no Thanos knowledge is required in order to run it._
@@ -30,7 +31,7 @@ The **[store-gateway](./store-gateway.md)** is responsible to query blocks and i

The **[compactor](./compactor.md)** is responsible for merging and deduplicating smaller blocks into larger ones, in order to reduce the number of blocks stored in the long-term storage for a given tenant and to query them more efficiently. The compactor is optional but highly recommended.

Finally, the **table-manager** and the [**schema**](../configuration/schema-config-reference.md) configuration are **not used** by the blocks storage.
Finally, the [**table-manager**](../chunks-storage/table-manager.md) and the [**schema config**](../chunks-storage/schema-config.md) are **not used** by the blocks storage.

### The write path

@@ -44,7 +45,7 @@ In order to effectively use the **WAL** and being able to recover the in-memory

The series sharding and replication done by the distributor don't change based on the storage engine.

It's important to note that - differently than the [chunks storage](../architecture.md#chunks-storage-default) - due to the replication factor N (typically 3), each time series is stored by N ingesters. Since each ingester writes its own block to the long-term storage, this leads a storage utilization N times more than the chunks storage. [Compactor](./compactor.md) solves this problem by merging blocks from multiple ingesters into a single block, and removing duplicated samples.
It's important to note that - differently from the [chunks storage](../chunks-storage/_index.md) - due to the replication factor N (typically 3), each time series is stored by N ingesters. Since each ingester writes its own block to the long-term storage, this leads to a storage utilization N times higher than the chunks storage. The [compactor](./compactor.md) solves this problem by merging blocks from multiple ingesters into a single block and removing duplicated samples. After blocks compaction, the storage utilization is significantly lower compared to the chunks storage for the same exact series and samples.

For more information, please refer to the following dedicated sections:

2 changes: 1 addition & 1 deletion docs/case-studies/_index.md
@@ -3,6 +3,6 @@ title: "Case Studies"
linkTitle: "Case Studies"
slug: case-studies
no_section_index_title: true
weight: 11
weight: 9
menu:
---
36 changes: 36 additions & 0 deletions docs/chunks-storage/_index.md
@@ -0,0 +1,36 @@
---
title: "Chunks Storage"
linkTitle: "Chunks Storage"
weight: 4
menu:
---

The chunks storage is a Cortex storage engine that stores each single time series in a separate object called a _chunk_. Each chunk contains the samples for a given time period (defaults to 12 hours). Chunks are then indexed by time range and labels, in order to provide fast lookups across many (potentially millions of) chunks. For this reason, the Cortex chunks storage requires two storage backends: a key-value store for the index and an object store for the chunks.

The supported backends for the **index store** are:

* [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
* [Google Bigtable](https://cloud.google.com/bigtable)
* [Apache Cassandra](https://cassandra.apache.org)

The supported backends for the **chunks store** are:

* [Amazon DynamoDB](https://aws.amazon.com/dynamodb)
* [Google Bigtable](https://cloud.google.com/bigtable)
* [Apache Cassandra](https://cassandra.apache.org)
* [Amazon S3](https://aws.amazon.com/s3)
* [Google Cloud Storage](https://cloud.google.com/storage/)
* [Microsoft Azure Storage](https://azure.microsoft.com/en-us/services/storage/)
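
The index store and the chunks store backends are selected, per time period, through the `store` and `object_store` settings of the [schema config](schema-config.md). The following is a minimal illustrative sketch (the start date and table prefixes are placeholders, not defaults) using Cassandra for both the index and the chunks:

```yaml
configs:
  - from: "2020-10-01"       # placeholder start date
    schema: v9
    store: cassandra         # key-value store used for the index
    object_store: cassandra  # store used for the chunks
    index:
      prefix: index_
      period: 1w
    chunks:
      prefix: chunk_
      period: 1w
```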

## Storage versioning

The chunks storage is based on a custom data format. The **chunks and index formats are versioned**: this allows Cortex operators to upgrade the cluster to take advantage of new features and improvements. This strategy enables changes to the storage format without requiring any downtime or complex procedures to rewrite the stored data. A set of schemas is used to map the version while reading and writing time series belonging to a specific period of time.

The current schema recommendation is the **v9 schema** for most use cases and **v10 schema** if you expect to have very high cardinality metrics (v11 is still experimental). For more information about the schema, please check out the [schema configuration](schema-config.md).

## Guides

The following step-by-step guides can help you set up Cortex running the chunks storage:

- [Running Cortex chunks storage in Production](../guides/running-chunks-storage-in-production.md)
- [Running Cortex chunks storage with Cassandra](../guides/running-chunks-storage-with-cassandra.md)
19 changes: 8 additions & 11 deletions docs/production/storage-aws.md → docs/chunks-storage/aws-tips.md
@@ -1,15 +1,13 @@
---
title: "Running Cortex with AWS Services"
linkTitle: "Running Cortex with AWS Services"
weight: 2
slug: aws
title: "AWS tips"
linkTitle: "AWS tips"
weight: 10
slug: aws-tips
---

[this is a work in progress]
This page shares some tips and things to take into consideration when running the Cortex chunks storage on AWS.

See also the [Running in Production](running.md) document.

## Credentials
## AWS Credentials

You can supply credentials to Cortex by setting environment variables
`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` (and `AWS_SESSION_TOKEN`
@@ -20,15 +18,14 @@ if you use MFA), or use a short-term token solution such as

Note that the choices for the chunks storage backend are: "chunks" of
timeseries data in S3 and index in DynamoDB, or everything in DynamoDB.
Using just S3 is not an option, unless you use the [blocks storage](../../blocks-storage/) engine.
Using just S3 is not an option, unless you use the [blocks storage](../blocks-storage/_index.md) engine.
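
For illustration only (a sketch, not an authoritative reference), a [schema config](./schema-config.md) entry for the common setup - index in DynamoDB and chunks in S3 - could look like the following; the start date and table prefixes are placeholders, and the DynamoDB and S3 clients still need to be configured in the storage settings:

```yaml
configs:
  - from: "2020-01-01"    # placeholder start date
    schema: v9
    store: aws-dynamo     # index tables in DynamoDB
    object_store: s3      # chunk objects in S3
    index:
      prefix: cortex_index_
      period: 1w
    chunks:
      prefix: cortex_chunks_
      period: 1w
```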

Broadly S3 is much more expensive to read and write, while DynamoDB is
much more expensive to store over months. S3 charges differently, so
the cross-over will depend on the size of your chunks, and how long
you keep them. Very roughly: for 3KB chunks if you keep them longer
than 8 months then S3 is cheaper.


## DynamoDB capacity provisioning

By default, the Cortex Tablemanager will provision tables with 1,000
@@ -89,4 +86,4 @@ Several things to note here:
older data, which is never written and only read sporadically.
- If you want to add AWS tags to the created DynamoDB tables you
can do it by adding a `tags` map to your schema definition. See
[`schema configuration`](../configuration/schema-config-reference.md)
[`schema configuration`](./schema-config.md)
6 changes: 3 additions & 3 deletions docs/production/caching.md → docs/chunks-storage/caching.md
@@ -1,7 +1,7 @@
---
title: "Caching in Cortex"
linkTitle: "Caching in Cortex"
weight: 5
title: "Caching"
linkTitle: "Caching"
weight: 4
slug: caching
---

@@ -5,7 +5,7 @@ weight: 5
slug: ingesters-with-wal
---

Currently the ingesters running in the chunks storage mode, store all their data in memory. If there is a crash, there could be loss of data. WAL helps fill this gap in reliability.
By default, ingesters running with the chunks storage store all their data in memory. If there is a crash, there could be loss of data. The Write-Ahead Log (WAL) helps fill this gap in reliability.

To use the WAL, there are some changes that need to be made to the deployment.

@@ -93,7 +93,7 @@ PS: Given you have to scale down 1 ingester at a time, you can pipeline the shut

**Fallback option**

There is a `flusher` target that can be used to flush the data in the WAL. It's config can be found [here](../configuration/config-file-reference.md#flusher-config). As flusher depends on the chunk store and the http API components, you need to also set all the config related to them similar to ingesters (see [api,storage,chunk_store,limits,runtime_config](../configuration/config-file-reference.md#supported-contents-and-default-values-of-the-config-file) and [schema](../configuration/schema-config-reference.md)). Pro tip: Re-use the ingester config and set the `target` as `flusher` with additional flusher config, the irrelevant config will be ignored.
There is a `flusher` target that can be used to flush the data in the WAL. Its config can be found [here](../configuration/config-file-reference.md#flusher-config). As the flusher depends on the chunk store and the HTTP API components, you also need to set all the config related to them, similar to the ingesters (see [api,storage,chunk_store,limits,runtime_config](../configuration/config-file-reference.md#supported-contents-and-default-values-of-the-config-file) and [schema](schema-config.md)). Pro tip: re-use the ingester config and set the `target` as `flusher` with the additional flusher config; the irrelevant config will be ignored.
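
For example (a hypothetical sketch, not a verbatim configuration), you could re-use the ingester's config file and only override the target and the flusher-specific options; see the [flusher config](../configuration/config-file-reference.md#flusher-config) reference for the exact option names:

```yaml
# Same config file used by the ingesters, with the target overridden.
target: flusher

flusher:
  # Must point to the WAL directory written by the ingester whose data
  # should be flushed.
  wal_dir: /data/wal
```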

You can run it as a Kubernetes job which will:

149 changes: 149 additions & 0 deletions docs/chunks-storage/schema-config.md
@@ -0,0 +1,149 @@
---
title: "Schema Configuration"
linkTitle: "Schema Configuration"
weight: 2
slug: schema-configuration
---

The Cortex chunks storage stores indexes and chunks in table-based data storages. When such a storage type is used, multiple tables are created over time: each table - also called a periodic table - contains the data for a specific time range. The table-based storage layout is configured through a configuration file called the **schema config**.

_The schema config is used only by the chunks storage, while it's **not** used by the [blocks storage](../blocks-storage/_index.md) engine._

## Design

The table-based design brings two main benefits:

1. **Schema config changes**<br />
   Each table is bound to a schema config and version, so that changes can be introduced over time and multiple schema configs can coexist.
2. **Retention**<br />
   Retention is implemented by deleting entire tables, which allows for fast delete operations.

The [**table-manager**](./table-manager.md) is the Cortex service responsible for creating a periodic table before its time period begins, and deleting it once its data time range exceeds the retention period.
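
As a sketch (the option names shown here are indicative; see the [table-manager](./table-manager.md) documentation for the authoritative reference), retention is typically enabled with a configuration similar to:

```yaml
table_manager:
  # Allow the table-manager to delete tables older than the retention period.
  retention_deletes_enabled: true
  # 13 weeks, i.e. a multiple of the (typically 1w) table period.
  retention_period: 91d
```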

## Periodic tables

A periodic table stores the index or the chunks relative to a specific period of time. The duration of the time range covered by a single table, along with its storage type, is configured in the `configs` block of the [schema config](#schema-config) file.

The `configs` block can contain multiple entries. Each config defines the storage used from the day set in `from` (in the `yyyy-mm-dd` format) until the next config entry, or "now" in the case of the last schema config entry.

This allows you to have multiple non-overlapping schema configs over time, in order to perform schema version upgrades or change storage settings (including changing the storage type).

![Schema config - periodic table](/images/chunks-storage/schema-config-periodic-tables.png)
<!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit -->

The write path hits the table that the sample timestamp falls into (usually the last table, except for short periods close to the end of a table and the beginning of the next one), while the read path hits the tables containing data for the query time range.
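
For example, with weekly tables a sample written at the beginning of a new week lands in that week's freshly created table, while a query spanning the last 10 days reads from both the current weekly table and the previous one.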

## Schema versioning

Cortex supports multiple schema versions (currently there are 11), but we recommend running with the **v9 schema** for most use cases and the **v10 schema** if you expect to have very high cardinality metrics. You can move from one schema to another if a new schema fits your purpose better, but you still need to configure Cortex to make sure it can read the old data written with the old schemas.

## Schema config

The path to the schema config YAML file can be specified to Cortex via the CLI flag `-schema-config-file`. The file has the following structure:

```yaml
configs: []<period_config>
```

### `<period_config>`

The `period_config` configures a single period during which the storage is using a specific schema version and backend storage.

```yaml
# The starting date in YYYY-MM-DD format (e.g. 2020-03-01).
from: <string>
# The key-value store to use for the index. Supported values are:
# aws-dynamo, bigtable, bigtable-hashed, cassandra, boltdb.
store: <string>
# The object store to use for the chunks. Supported values are:
# s3, aws-dynamo, bigtable, bigtable-hashed, gcs, cassandra, filesystem.
# If none is specified, "store" is used for storing chunks as well.
[object_store: <string>]
# The schema version to use. Supported versions are: v1, v2, v3, v4, v5,
# v6, v9, v10, v11. We recommend v9 for most use cases, or alternatively
# v10 if you expect to have very high cardinality metrics.
schema: <string>
index: <periodic_table_config>
chunks: <periodic_table_config>
```

### `<periodic_table_config>`

The `periodic_table_config` configures the tables for a single period.

```yaml
# The prefix to use for the table names.
prefix: <string>
# The duration for each table. A new table is created every "period", which also
# represents the granularity with which retention is enforced. Typically this value
# is set to 1w (1 week). Must be a multiple of 24h.
period: <duration>
# The tags to be set on the created table.
tags: <map[string]string>
```
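
For instance, a minimal single-period schema config suitable for local testing (an illustrative sketch, not a production recommendation) could store the index in BoltDB and the chunks on the local filesystem:

```yaml
configs:
  - from: "2020-10-01"       # illustrative start date
    schema: v9
    store: boltdb            # index kept in local BoltDB files
    object_store: filesystem # chunks written to the local filesystem
    index:
      prefix: index_
      period: 1w
```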

## Schema config example

The following example shows an advanced schema file covering different changes over the course of a long period. It starts with v9 and just Bigtable. Later it was migrated to GCS as the object store, and finally moved to v10.

_This is a complex schema file showing several changes over time, while a typical schema config file usually has just one or two schema versions._

```yaml
configs:
  # Starting from 2018-08-23 Cortex should store chunks and indexes
  # on Google BigTable using weekly periodic tables. The chunks table
  # names will be prefixed with "dev_chunks_", while index tables will be
  # prefixed with "dev_index_".
  - from: "2018-08-23"
    schema: v9
    chunks:
      period: 1w
      prefix: dev_chunks_
    index:
      period: 1w
      prefix: dev_index_
    store: gcp-columnkey

  # Starting 2019-02-13 we moved from BigTable to GCS for storing the chunks.
  - from: "2019-02-13"
    schema: v9
    chunks:
      period: 1w
      prefix: dev_chunks_
    index:
      period: 1w
      prefix: dev_index_
    object_store: gcs
    store: gcp-columnkey

  # Starting 2019-02-24 we moved our index from bigtable-columnkey to bigtable-hashed
  # which improves the distribution of keys.
  - from: "2019-02-24"
    schema: v9
    chunks:
      period: 1w
      prefix: dev_chunks_
    index:
      period: 1w
      prefix: dev_index_
    object_store: gcs
    store: bigtable-hashed

  # Starting 2019-03-05 we moved from v9 schema to v10 schema.
  - from: "2019-03-05"
    schema: v10
    chunks:
      period: 1w
      prefix: dev_chunks_
    index:
      period: 1w
      prefix: dev_index_
    object_store: gcs
    store: bigtable-hashed
```