Cut release 0.5.0 on top of v0.5.0-rc.0; cherrypicked docs. (#1226)
* Added extra docs about updating Golang version. (#1209)

Signed-off-by: Bartek Plotka <[email protected]>

* Explained sidecar better in docs. (#1214)

Signed-off-by: Bartek Plotka <[email protected]>

* Cut release v0.5.0.

Signed-off-by: Bartek Plotka <[email protected]>
bwplotka authored Jun 6, 2019
1 parent d5441f2 commit 72820b3
Showing 7 changed files with 73 additions and 27 deletions.
2 changes: 2 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -3,6 +3,8 @@
about what components it touches e.g "query:" or ".*:"
-->

* [ ] CHANGELOG entry if change is relevant to the end user.

## Changes

<!-- Enumerate changes you made -->
18 changes: 12 additions & 6 deletions CHANGELOG.md
@@ -9,10 +9,16 @@ NOTE: As semantic versioning states all 0.y.z releases can contain breaking chan

We use *breaking* word for marking changes that are not backward compatible (relates only to v0.y.z releases.)

## [v0.5.0-rc.0](https://github.com/improbable-eng/thanos/releases/tag/v0.5.0-rc.0) - 2019.05.30
## Unreleased.

## [v0.5.0](https://github.com/improbable-eng/thanos/releases/tag/v0.5.0) - 2019.06.05

TL;DR: Store LRU cache is no longer leaking, Upgraded Thanos UI to Prometheus 2.9, Fixed auto-downsampling, Moved to Go 1.12.5 and more.

This version moved tarballs to Golang 1.12.5 from 1.11 as well, so the same warning applies if you use `container_memory_usage_bytes` from cadvisor. Use `container_memory_working_set_bytes` instead.

*breaking* As announced a couple of times, this release also removes gossip along with all its configuration flags (`--cluster.*`).

### Fixed

- [#1142](https://github.com/improbable-eng/thanos/pull/1142) fixed major leak on store LRU cache for index items (postings and series).
@@ -65,16 +71,16 @@ TL;DR: Store LRU cache is no longer leaking, Upgraded Thanos UI to Prometheus 2.

## Deprecated

- [#1008](https://github.com/improbable-eng/thanos/pull/1008) Removed Gossip implementation.
- [#1008](https://github.com/improbable-eng/thanos/pull/1008) *breaking* Removed Gossip implementation. All `--cluster.*` flags removed and Thanos will error out if any is provided.

## [v0.4.0](https://github.com/improbable-eng/thanos/releases/tag/v0.4.0) - 2019.05.3

:warning: **IMPORTANT** :warning: This is the last release that supports gossip. From Thanos v0.5.0, gossip will be completely removed.

This release also disables gossip mode by default for all components.
See [this](docs/proposals/approved/201809_gossip-removal.md) for more details.
See [this](docs/proposals/completed/201809_gossip-removal.md) for more details.

:warning: This release moves Thanos docker images and artifacts to Golang 1.12. This release includes change in GC's memory release which gives following effect (source: https://golang.org/doc/go1.12):
:warning: This release moves Thanos docker images (NOT artifacts by accident) to Golang 1.12. This release includes change in GC's memory release which gives following effect (source: https://golang.org/doc/go1.12):

> On Linux, the runtime now uses MADV_FREE to release unused memory. This is more efficient but may result in higher reported RSS. The kernel will reclaim the unused data when it is needed. To revert to the Go 1.11 behavior (MADV_DONTNEED), set the environment variable GODEBUG=madvdontneed=1.
@@ -170,7 +176,7 @@ Note that this is required to have SRV resolution working on [Golang 1.11+ with
* tooling: [FEATURE] New dump command to tsdb tool to dump all samples.
* compactor:
* [ENHANCEMENT] When closing the db any running compaction will be cancelled so it doesn't block.
* [CHANGE] Renamed flag `--sync-delay` to `--consistency-delay` [#1053](https://github.com/improbable-eng/thanos/pull/1053)
* [CHANGE] *breaking* Renamed flag `--sync-delay` to `--consistency-delay` [#1053](https://github.com/improbable-eng/thanos/pull/1053)

For ruler essentially the whole TSDB CHANGELOG applies between v0.4.0-v0.6.1: https://github.com/prometheus/tsdb/blob/master/CHANGELOG.md

@@ -349,7 +355,7 @@ Note lots of necessary breaking changes in flags that relates to bucket configur

## [v0.1.0](https://github.com/improbable-eng/thanos/releases/tag/v0.1.0) - 2018.09.14

Initial version to have a stable reference before [gossip protocol removal](https://github.com/improbable-eng/thanos/blob/master/docs/proposals/gossip-removal.md).
Initial version to have a stable reference before [gossip protocol removal](/docs/proposals/completed/201809_gossip-removal.md).

### Added

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.5.0-rc.0
0.5.0
53 changes: 35 additions & 18 deletions docs/components/sidecar.md
@@ -6,19 +6,44 @@ menu: components

# Sidecar

The sidecar component of Thanos gets deployed along with a Prometheus instance. It implements Thanos' Store API on top of Prometheus' remote-read API and advertises itself as a data source to the cluster. Thereby queriers in the cluster can treat Prometheus servers as yet another source of time series data without directly talking to its APIs.
Additionally, the sidecar uploads TSDB blocks to an object storage bucket as Prometheus produces them. This allows Prometheus servers to be run with relatively low retention while their historic data is made durable and queryable via object storage.
The sidecar component of Thanos gets deployed along with a Prometheus instance. This allows the sidecar to optionally upload metrics to object storage and allows [Queriers](./query.md) to query Prometheus data with the common, efficient StoreAPI.

Prometheus servers connected to the Thanos cluster via the sidecar are subject to a few limitations for safe operations:
In detail:

* The minimum Prometheus version is 2.2.1
* The `external_labels` section of the configuration implements is in line with the desired label scheme (will be used by query layer to filter out store APIs to query).
* The `--web.enable-lifecycle` flag is enabled if you want to use `reload.*` flags.
* The `--storage.tsdb.min-block-duration` and `--storage.tsdb.max-block-duration` must be set to equal values to disable local compaction on order to use Thanos sidecar upload. Leave local compaction on if sidecar just exposes StoreAPI and your retention is normal. The default of `2h` is recommended.
Mentioned parameters set to equal values disable the internal Prometheus compaction, which is needed to avoid the uploaded data corruption when thanos compactor does its job, this is critical for data consistency and should not be ignored if you plan to use Thanos compactor. Even though you set mentioned parameters equal, you might observe Prometheus internal metric `prometheus_tsdb_compactions_total` being incremented, don't be confused by that: Prometheus writes initial head block to filestem via internal compaction mechanis, but if you followed recommendations - data won't be modified by Prometheus before sidecar uploads it. Thanos sidecar will also check sanity of the flags set to Prometheus on the startup and log errors or warning if they have been configured improperly (#838).
* It implements Thanos' Store API on top of Prometheus' remote-read API. This allows [Queriers](./query.md) to treat Prometheus servers as yet another source of time series data without directly talking to its APIs.
* Optionally, the sidecar uploads TSDB blocks to an object storage bucket as Prometheus produces them every 2 hours. This allows Prometheus servers to be run with relatively low retention while their historic data is made durable and queryable via object storage.

The retention is recommended to not be lower than three times the block duration. This achieves resilience in the face of connectivity issues
to the object storage since all local data will remain available within the Thanos cluster. If connectivity gets restored the backlog of blocks gets uploaded to the object storage.
NOTE: This still does NOT mean that Prometheus can be fully stateless, because if it crashes and restarts you will lose ~2 hours of metrics, so a persistent disk for Prometheus is highly recommended. The closest to stateless you can get is using remote write (which Thanos experimentally supports, see [this](../proposals/approved/201812_thanos-remote-receive.md)). Remote write has other risks and consequences, and even then a crash loses, in the best case, seconds of metrics data, so a persistent disk is recommended in all cases.

* Optionally, the Thanos sidecar is able to watch Prometheus rules and configuration, decompress and substitute environment variables if needed, and ping Prometheus to reload them. Read more about this [here](./query.md#reloader-configuration).


Prometheus servers connected to the Thanos cluster via the sidecar are subject to a few limitations and recommendations for safe operations:

* The recommended Prometheus version is 2.2.1 or greater (including newest releases). This is due to Prometheus instability in previous versions as well as the lack of the `flags` endpoint.
* (!) The `external_labels` section of the Prometheus configuration file has labels that are unique across the overall Thanos system. Those external labels will be used by the sidecar and then by Thanos in many places:

* [Querier](./query.md) to filter out store APIs to touch during query requests.
* Many object storage readers like [compactor](./compact.md) and [store gateway](./store.md), which group the blocks by Prometheus source. Each TSDB block produced by Prometheus is labelled with the external labels by the sidecar before upload to object storage.

* The `--web.enable-lifecycle` flag is enabled if you want to use sidecar reloading features (`--reload.*` flags).

If you choose to use the sidecar to also upload to object storage:

* The `--storage.tsdb.min-block-duration` and `--storage.tsdb.max-block-duration` flags must be set to equal values to disable local compaction in order to use Thanos sidecar upload. Leave local compaction on if the sidecar only exposes StoreAPI and your retention is normal. The default of `2h` is recommended.
Setting these parameters to equal values disables the internal Prometheus compaction, which is needed to avoid corruption of the uploaded data when the Thanos compactor does its job. This is critical for data consistency and should not be ignored if you plan to use the Thanos compactor. Even with these parameters set equal, you might observe the Prometheus internal metric `prometheus_tsdb_compactions_total` being incremented. Don't be confused by that: Prometheus writes the initial head block to the filesystem via its internal compaction mechanism, but if you have followed the recommendations, the data won't be modified by Prometheus before the sidecar uploads it. The Thanos sidecar will also check the sanity of the flags set on Prometheus at startup and log errors or warnings if they have been configured improperly (#838).
* The retention is recommended to be no lower than three times the min block duration, so 6 hours. This achieves resilience in the face of connectivity issues to the object storage, since all local data will remain available within the Thanos cluster. Once connectivity is restored, the backlog of blocks gets uploaded to the object storage.
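
The block-duration and retention recommendations above can be sketched as a small script. This is only an illustration under stated assumptions: the variable names are made up for this example, and `--storage.tsdb.retention` is assumed to be the retention flag of the Prometheus version in use (newer Prometheus releases name it `--storage.tsdb.retention.time`). The script prints the command line rather than starting Prometheus.

```bash
# Sketch only: derive a retention of three times the block duration.
BLOCK_HOURS=2                            # recommended default block duration (2h)
RETENTION_HOURS=$(( BLOCK_HOURS * 3 ))   # three times the min block duration

# Equal min/max block durations disable local compaction.
# Assumption: --storage.tsdb.retention is the retention flag of your Prometheus
# version (newer releases use --storage.tsdb.retention.time instead).
PROM_FLAGS="--storage.tsdb.min-block-duration=${BLOCK_HOURS}h"
PROM_FLAGS="${PROM_FLAGS} --storage.tsdb.max-block-duration=${BLOCK_HOURS}h"
PROM_FLAGS="${PROM_FLAGS} --storage.tsdb.retention=${RETENTION_HOURS}h"

# Print the resulting command line instead of running it.
echo "prometheus ${PROM_FLAGS}"
```

Keeping the block duration in one variable makes it harder to set the min and max flags to different values by accident.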

## Reloader Configuration

Thanos can watch changes in Prometheus configuration and refresh Prometheus configuration if `--web.enable-lifecycle` is enabled.

You can configure watching for changes in a directory via the `--reloader.rule-dir=DIR_NAME` flag.

The Thanos sidecar can watch the `--reloader.config-file=CONFIG_FILE` configuration file, evaluate environment variables found in it and write the generated config to the `--reloader.config-envsubst-file=OUT_CONFIG_FILE` file.
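
As a rough illustration of how the reloader flags fit together, the sketch below assembles a sidecar command line. All file paths and the Prometheus URL are hypothetical placeholders chosen for this example, not values taken from the Thanos documentation. The command is printed rather than executed.

```bash
# Hypothetical paths; adjust them to your own deployment.
CONFIG_IN="/etc/prometheus/prometheus.yaml.tmpl"    # config containing env variables
CONFIG_OUT="/etc/prometheus-shared/prometheus.yaml" # generated config Prometheus reads
RULE_DIR="/etc/prometheus/rules"                    # directory watched for rule changes

# Assumption: Prometheus listens on localhost:9090 and was started with
# --web.enable-lifecycle so that the sidecar can trigger reloads.
SIDECAR_CMD="thanos sidecar"
SIDECAR_CMD="${SIDECAR_CMD} --prometheus.url=http://localhost:9090"
SIDECAR_CMD="${SIDECAR_CMD} --reloader.config-file=${CONFIG_IN}"
SIDECAR_CMD="${SIDECAR_CMD} --reloader.config-envsubst-file=${CONFIG_OUT}"
SIDECAR_CMD="${SIDECAR_CMD} --reloader.rule-dir=${RULE_DIR}"

# Print the resulting command line instead of running it.
echo "${SIDECAR_CMD}"
```

In this layout Prometheus would be pointed at the generated file (`CONFIG_OUT`), not at the template the sidecar watches.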


## Example basic deployment

```bash
$ prometheus \
@@ -97,11 +122,3 @@ Flags:
Object store configuration in YAML.

```

## Reloader Configuration

Thanos can watch changes in Prometheus configuration and refresh Prometheus configuration if `--web.enable-lifecycle` enabled.

You can configure watching for changes in directory via `--reloader.rule-dir=DIR_NAME` flag.

Thanos sidecar can watch `--reloader.config-file=CONFIG_FILE` configuration file, evaluate environment variables found in there and produce generated config in `--reloader.config-envsubst-file=OUT_CONFIG_FILE` file.
16 changes: 16 additions & 0 deletions docs/contributing/how-to-change-go-version.md
@@ -0,0 +1,16 @@
---
title: Changing Golang version
type: docs
menu: contributing
slug: /how-to-change-go-version.md
---

The Thanos build system is pinned to a certain Golang version. This is to ensure that Golang version
changes are done by us in a controlled, traceable way.

To update the Thanos build system to a newer Golang version:

1. Edit [.promu.yml](/.promu.yml) and set `go: version: <go version>` in the YAML to the desired version. This will ensure that all artifacts are
built with the desired Golang version. How to verify? Download the tarball, unpack it and invoke `thanos --version`.
1. Edit [.circleci/config.yml](/.circleci/config.yml) and set ` - image: circleci/golang:<go version>` to the desired
Golang version. This will ensure that all docker images and go tests use the desired Golang version. How to verify? Invoke `docker run improbable/thanos:<version> --version`.
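
Before building, it can help to check which Go version is currently pinned. The helper below is a hypothetical sketch, not part of the Thanos tooling: `pinned_go_version` is a name invented for this example, and it assumes the simple layout of a top-level `go:` key followed by an indented `version:` line.

```bash
# Hypothetical helper: print the Go version pinned under the top-level "go:"
# key of a promu config file. Assumes the layout:
#   go:
#       version: <go version>
pinned_go_version() {
  awk '
    /^go:/      { in_go = 1; next }   # entered the top-level go: section
    /^[A-Za-z]/ { in_go = 0 }         # any other top-level key ends it
    in_go && $1 == "version:" { print $2; exit }
  ' "$1"
}

# Usage (run from the repository root):
#   pinned_go_version .promu.yml
```

Comparing this value with the Go version reported by `thanos --version` from a freshly built tarball is one way to confirm the pin took effect.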
@@ -2,7 +2,7 @@
title: Deprecated gossip clustering in favor of File SD
type: proposal
menu: proposals
status: accepted
status: completed
owner: bwplotka
---

7 changes: 6 additions & 1 deletion docs/release-process.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,9 @@
# Releases
---
title: Release Process
type: docs
menu: thanos
slug: /release-process.md
---

This page describes the release cadence and process for Thanos project.
