From 72820b3f41794140403fd04d6da82299f2c16447 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bartek=20P=C5=82otka?= Date: Thu, 6 Jun 2019 11:39:37 +0100 Subject: [PATCH] Cut release 0.5.0 on top of v0.5.0-rc.0; cherrypicked docs. (#1226) * Added extra docs about updating Golang version. (#1209) Signed-off-by: Bartek Plotka * Explained sidecar better in docs. (#1214) Signed-off-by: Bartek Plotka * Cut release v0.5.0. Signed-off-by: Bartek Plotka --- .github/PULL_REQUEST_TEMPLATE.md | 2 + CHANGELOG.md | 18 ++++--- VERSION | 2 +- docs/components/sidecar.md | 53 ++++++++++++------- docs/contributing/how-to-change-go-version.md | 16 ++++++ .../201809_gossip-removal.md | 2 +- docs/release-process.md | 7 ++- 7 files changed, 73 insertions(+), 27 deletions(-) create mode 100644 docs/contributing/how-to-change-go-version.md rename docs/proposals/{approved => completed}/201809_gossip-removal.md (99%) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index d4e4d52649..1882e3347c 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -3,6 +3,8 @@ about what components it touches e.g "query:" or ".*:" --> +* [ ] CHANGELOG entry if change is relevant to the end user. + ## Changes diff --git a/CHANGELOG.md b/CHANGELOG.md index b6c96394e0..96a0d652ff 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,10 +9,16 @@ NOTE: As semantic versioning states all 0.y.z releases can contain breaking chan We use *breaking* word for marking changes that are not backward compatible (relates only to v0.y.z releases.) -## [v0.5.0-rc.0](https://github.com/improbable-eng/thanos/releases/tag/v0.5.0-rc.0) - 2019.05.30 +## Unreleased. + +## [v0.5.0](https://github.com/improbable-eng/thanos/releases/tag/v0.5.0) - 2019.06.05 TL;DR: Store LRU cache is no longer leaking, Upgraded Thanos UI to Prometheus 2.9, Fixed auto-downsampling, Moved to Go 1.12.5 and more. 
+This version moved tarballs to Golang 1.12.5 from 1.11 as well, so the same warning applies if you use `container_memory_usage_bytes` from cAdvisor. Use `container_memory_working_set_bytes` instead. + +*breaking* As announced a couple of times, this release also removes gossip along with all its configuration flags (`--cluster.*`). + ### Fixed - [#1142](https://github.com/improbable-eng/thanos/pull/1142) fixed major leak on store LRU cache for index items (postings and series). @@ -65,16 +71,16 @@ TL;DR: Store LRU cache is no longer leaking, Upgraded Thanos UI to Prometheus 2. ## Deprecated -- [#1008](https://github.com/improbable-eng/thanos/pull/1008) Removed Gossip implementation. +- [#1008](https://github.com/improbable-eng/thanos/pull/1008) *breaking* Removed Gossip implementation. All `--cluster.*` flags are removed and Thanos will error out if any is provided. ## [v0.4.0](https://github.com/improbable-eng/thanos/releases/tag/v0.4.0) - 2019.05.3 :warning: **IMPORTANT** :warning: This is the last release that supports gossip. From Thanos v0.5.0, gossip will be completely removed. This release also disables gossip mode by default for all components. -See [this](docs/proposals/approved/201809_gossip-removal.md) for more details. +See [this](docs/proposals/completed/201809_gossip-removal.md) for more details. -:warning: This release moves Thanos docker images and artifacts to Golang 1.12. This release includes change in GC's memory release which gives following effect (source: https://golang.org/doc/go1.12): +:warning: This release moves Thanos docker images (by accident NOT artifacts) to Golang 1.12. This release includes a change in the GC's memory release, which has the following effect (source: https://golang.org/doc/go1.12): > On Linux, the runtime now uses MADV_FREE to release unused memory. This is more efficient but may result in higher reported RSS. The kernel will reclaim the unused data when it is needed. 
To revert to the Go 1.11 behavior (MADV_DONTNEED), set the environment variable GODEBUG=madvdontneed=1. @@ -170,7 +176,7 @@ Note that this is required to have SRV resolution working on [Golang 1.11+ with * tooling: [FEATURE] New dump command to tsdb tool to dump all samples. * compactor: * [ENHANCEMENT] When closing the db any running compaction will be cancelled so it doesn't block. - * [CHANGE] Renamed flag `--sync-delay` to `--consistency-delay` [#1053](https://github.com/improbable-eng/thanos/pull/1053) + * [CHANGE] *breaking* Renamed flag `--sync-delay` to `--consistency-delay` [#1053](https://github.com/improbable-eng/thanos/pull/1053) For ruler essentially whole TSDB CHANGELOG applies beween v0.4.0-v0.6.1: https://github.com/prometheus/tsdb/blob/master/CHANGELOG.md @@ -349,7 +355,7 @@ Note lots of necessary breaking changes in flags that relates to bucket configur ## [v0.1.0](https://github.com/improbable-eng/thanos/releases/tag/v0.1.0) - 2018.09.14 -Initial version to have a stable reference before [gossip protocol removal](https://github.com/improbable-eng/thanos/blob/master/docs/proposals/gossip-removal.md). +Initial version to have a stable reference before [gossip protocol removal](/docs/proposals/completed/201809_gossip-removal.md). ### Added diff --git a/VERSION b/VERSION index 57d1946355..79a2734bbf 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.5.0-rc.0 \ No newline at end of file +0.5.0 \ No newline at end of file diff --git a/docs/components/sidecar.md b/docs/components/sidecar.md index 57e0894ba7..13810fe182 100644 --- a/docs/components/sidecar.md +++ b/docs/components/sidecar.md @@ -6,19 +6,44 @@ menu: components # Sidecar -The sidecar component of Thanos gets deployed along with a Prometheus instance. It implements Thanos' Store API on top of Prometheus' remote-read API and advertises itself as a data source to the cluster. 
Thereby queriers in the cluster can treat Prometheus servers as yet another source of time series data without directly talking to its APIs. -Additionally, the sidecar uploads TSDB blocks to an object storage bucket as Prometheus produces them. This allows Prometheus servers to be run with relatively low retention while their historic data is made durable and queryable via object storage. +The sidecar component of Thanos gets deployed along with a Prometheus instance. This allows the sidecar to optionally upload metrics to object storage and allows [Queriers](./query.md) to query Prometheus data with the common, efficient StoreAPI. -Prometheus servers connected to the Thanos cluster via the sidecar are subject to a few limitations for safe operations: +In detail: -* The minimum Prometheus version is 2.2.1 -* The `external_labels` section of the configuration implements is in line with the desired label scheme (will be used by query layer to filter out store APIs to query). -* The `--web.enable-lifecycle` flag is enabled if you want to use `reload.*` flags. -* The `--storage.tsdb.min-block-duration` and `--storage.tsdb.max-block-duration` must be set to equal values to disable local compaction on order to use Thanos sidecar upload. Leave local compaction on if sidecar just exposes StoreAPI and your retention is normal. The default of `2h` is recommended. - Mentioned parameters set to equal values disable the internal Prometheus compaction, which is needed to avoid the uploaded data corruption when thanos compactor does its job, this is critical for data consistency and should not be ignored if you plan to use Thanos compactor. 
Even though you set mentioned parameters equal, you might observe Prometheus internal metric `prometheus_tsdb_compactions_total` being incremented, don't be confused by that: Prometheus writes initial head block to filestem via internal compaction mechanis, but if you followed recommendations - data won't be modified by Prometheus before sidecar uploads it. Thanos sidecar will also check sanity of the flags set to Prometheus on the startup and log errors or warning if they have been configured improperly (#838). +* It implements Thanos' Store API on top of Prometheus' remote-read API. This allows [Queriers](./query.md) to treat Prometheus servers as yet another source of time series data without directly talking to its APIs. +* Optionally, the sidecar uploads TSDB blocks to an object storage bucket as Prometheus produces them every 2 hours. This allows Prometheus servers to be run with relatively low retention while their historic data is made durable and queryable via object storage. -The retention is recommended to not be lower than three times the block duration. This achieves resilience in the face of connectivity issues -to the object storage since all local data will remain available within the Thanos cluster. If connectivity gets restored the backlog of blocks gets uploaded to the object storage. + NOTE: This still does NOT mean that Prometheus can be fully stateless, because if it crashes and restarts you will lose ~2 hours of metrics, so a persistent disk for Prometheus is highly recommended. The closest to stateless you can get is using remote write (which Thanos experimentally supports, see [this](../proposals/approved/201812_thanos-remote-receive.md)). Remote write has other risks and consequences, and even then a crash loses, in the best case, seconds of metrics data, so a persistent disk is recommended in all cases. 
+ +* Optionally, Thanos sidecar is able to watch Prometheus rules and configuration, decompress and substitute environment variables if needed and ping Prometheus to reload them. Read more about this [here](./sidecar.md#reloader-configuration). + + +Prometheus servers connected to the Thanos cluster via the sidecar are subject to a few limitations and recommendations for safe operations: + +* The recommended Prometheus version is 2.2.1 or greater (including newest releases). This is due to Prometheus instability in previous versions as well as the lack of a `flags` endpoint. +* (!) The `external_labels` section of the Prometheus configuration file must have labels that are unique in the overall Thanos system. Those external labels will be used by the sidecar and then by Thanos in many places: + + * [Querier](./query.md) to filter out store APIs to touch during query requests. + * Many object storage readers like [compactor](./compact.md) and [store gateway](./store.md), which group the blocks by Prometheus source. Each TSDB block produced by Prometheus is labelled with the external labels by the sidecar before upload to object storage. + +* The `--web.enable-lifecycle` flag is enabled if you want to use sidecar reloading features (`--reloader.*` flags). + +If you choose to use the sidecar to also upload to object storage: + +* The `--storage.tsdb.min-block-duration` and `--storage.tsdb.max-block-duration` must be set to equal values to disable local compaction in order to use Thanos sidecar upload. Otherwise, leave local compaction on if the sidecar just exposes StoreAPI and your retention is normal. The default of `2h` is recommended. + Setting the mentioned parameters to equal values disables the internal Prometheus compaction, which is needed to avoid corruption of the uploaded data when Thanos compactor does its job. This is critical for data consistency and should not be ignored if you plan to use Thanos compactor. 
Even though you set the mentioned parameters equal, you might observe the Prometheus internal metric `prometheus_tsdb_compactions_total` being incremented. Don't be confused by that: Prometheus writes the initial head block to the filesystem via its internal compaction mechanism, but if you have followed the recommendations, data won't be modified by Prometheus before the sidecar uploads it. Thanos sidecar will also check the sanity of the flags set on Prometheus at startup and log errors or warnings if they have been configured improperly (#838). +* The retention is recommended to not be lower than three times the min block duration, so 6 hours. This achieves resilience in the face of connectivity issues to the object storage since all local data will remain available within the Thanos cluster. Once connectivity is restored, the backlog of blocks gets uploaded to the object storage. + +## Reloader Configuration + +Thanos can watch changes in Prometheus configuration and refresh Prometheus configuration if `--web.enable-lifecycle` is enabled. + +You can configure watching for changes in a directory via the `--reloader.rule-dir=DIR_NAME` flag. + +Thanos sidecar can watch the `--reloader.config-file=CONFIG_FILE` configuration file, evaluate environment variables found in it and write the generated config to the `--reloader.config-envsubst-file=OUT_CONFIG_FILE` file. + + +## Example basic deployment ```bash $ prometheus \ @@ -97,11 +122,3 @@ Flags: Object store configuration in YAML. ``` - -## Reloader Configuration - -Thanos can watch changes in Prometheus configuration and refresh Prometheus configuration if `--web.enable-lifecycle` enabled. - -You can configure watching for changes in directory via `--reloader.rule-dir=DIR_NAME` flag. - -Thanos sidecar can watch `--reloader.config-file=CONFIG_FILE` configuration file, evaluate environment variables found in there and produce generated config in `--reloader.config-envsubst-file=OUT_CONFIG_FILE` file. 
diff --git a/docs/contributing/how-to-change-go-version.md b/docs/contributing/how-to-change-go-version.md new file mode 100644 index 0000000000..8804263c26 --- /dev/null +++ b/docs/contributing/how-to-change-go-version.md @@ -0,0 +1,16 @@ +--- +title: Changing Golang version +type: docs +menu: contributing +slug: /how-to-change-go-version.md +--- + +Thanos build system is pinned to a certain Golang version. This is to ensure that Golang version +changes are done by us in a controlled, traceable way. + +To update Thanos build system to a newer Golang: + +1. Edit [.promu.yml](/.promu.yml) and set `go: version: ` in YAML to the desired version. This will ensure that all artifacts are + built with the desired Golang version. How to verify? Download the tarball, unpack it and invoke `thanos --version` +1. Edit [.circleci/config.yml](/.circleci/config.yml) and set ` - image: circleci/golang:` to the desired + Golang version. This will ensure that all docker images and go tests use the desired Golang version. How to verify? Invoke `docker pull improbable/thanos: --version` \ No newline at end of file diff --git a/docs/proposals/approved/201809_gossip-removal.md b/docs/proposals/completed/201809_gossip-removal.md similarity index 99% rename from docs/proposals/approved/201809_gossip-removal.md rename to docs/proposals/completed/201809_gossip-removal.md index 2c3b4de4d1..725a2d081c 100644 --- a/docs/proposals/approved/201809_gossip-removal.md +++ b/docs/proposals/completed/201809_gossip-removal.md @@ -2,7 +2,7 @@ title: Deprecated gossip clustering in favor of File SD type: proposal menu: proposals -status: accepted +status: completed owner: bwplotka --- diff --git a/docs/release-process.md b/docs/release-process.md index 995ddd3ef4..42cad943f1 100644 --- a/docs/release-process.md +++ b/docs/release-process.md @@ -1,4 +1,9 @@ -# Releases +--- +title: Release Process +type: docs +menu: thanos +slug: /release-process.md +--- This page describes the release cadence and process for Thanos project.