Releases: thanos-io/thanos
v0.30.0
v0.30 brings many important fixes & optimizations to compaction, store gateway, receive replication and querying. Make sure to try the new PromQL engine which is more & more efficient every week.
NOTE: Querier's `query.promql-engine` flag enabling the new PromQL engine is now unhidden. We encourage users to use the new experimental PromQL engine for efficiency reasons.
Furthermore, we recommend you use Redis as a caching client (if you use store GW or query frontend caching) and the Ketama algorithm as the receiver hashing algorithm (`--receive.hashrings-algorithm=ketama`, introducing consistent hashing to receiver).
Changes
Fixed
- #5716 DNS: Fix miekgdns resolver LookupSRV to work with CNAME records.
- #5844 Query Frontend: Fixes @ modifier time range when splitting queries by interval.
- #5854 Query Frontend: `lookback_delta` param is now handled in query frontend.
- #5860 Query: Fixed bug of not showing query warnings in Thanos UI.
- #5856 Store: Fixed handling of debug logging flag.
- #5230 Rule: Stateless ruler supports restoring `for` state from query API servers. The query API servers should be able to access the remote write storage.
- #5880 Query Frontend: Fixes some edge cases of query sharding analysis.
- #5893 Cache: Fixed redis client not respecting `SetMultiBatchSize` config value.
- #5966 Query: Stop relying on non-existent hints for mint and maxt when selecting series for the `api/v1/series` HTTP endpoint.
- #5948 Store: Fixed wrong calculation of `chunks_fetched_duration`.
- #5910 Receive: Fixed ketama quorum bug that could cause a success response for failed replication. This also heavily optimizes receiver CPU use.
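The quorum fix in #5910 concerns when a replicated write may be reported as successful. The following is a minimal sketch of that idea, assuming a majority quorum of `replication_factor // 2 + 1`. The function names are hypothetical and are not Thanos APIs; the quorum formula is an assumption for illustration, not taken from the changelog.

```python
from typing import List


def write_quorum(replication_factor: int) -> int:
    """Majority quorum. Assumption for illustration, not the Thanos source."""
    return replication_factor // 2 + 1


def replication_succeeded(results: List[bool], replication_factor: int) -> bool:
    """Report success only if at least a quorum of replica writes succeeded."""
    successes = sum(1 for ok in results if ok)
    return successes >= write_quorum(replication_factor)


if __name__ == "__main__":
    # Two of three replicas acknowledged the write: quorum reached.
    print(replication_succeeded([True, True, False], 3))
    # Only one of three acknowledged: the request must be reported as failed.
    print(replication_succeeded([True, False, False], 3))
```

The point of the check is that a failed replication can never be counted towards the quorum, which is the class of bug the entry above describes.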
Added
- #5814 Store: Added metric `thanos_bucket_store_postings_size_bytes` that shows the distribution of how many postings (in bytes) were needed for each Series() call in Thanos Store. Useful for determining limits.
- #5703 StoreAPI: Added `hash` field to series' chunks. Store gateway and receive implement that field and proxy leverages it for quicker deduplication.
- #5801 Store: Added a new flag `--store.grpc.downloaded-bytes-limit` that limits the number of bytes downloaded in each Series/LabelNames/LabelValues call. Use `thanos_bucket_store_postings_size_bytes` for determining the limits.
- #5836 Receive: Added hidden flag `tsdb.memory-snapshot-on-shutdown` to enable experimental TSDB feature to snapshot on shutdown. This is intended to speed up receiver restart.
- #5839 Receive: Added parameter `--tsdb.out-of-order.time-window` to set time window for experimental out-of-order samples ingestion. Disabled by default (set to 0s). Please note if you enable this option and you use compactor, make sure you set the `--enable-vertical-compaction` flag, otherwise you might risk compactor halt.
- #5889 Query Frontend: Added support for vertical sharding `label_replace` and `label_join` functions.
- #5865 Compact: Retry on sync metas error.
- #5819 Store: Added a few objectives for Store's data summaries (touched/fetched amount and sizes). They are: 50, 95, and 99 quantiles.
- #5837 Store: Added streaming retrieval of series from object storage.
- #5940 Objstore: Support for authenticating to Swift using application credentials.
- #5945 Tools: Added new `no-downsample` marker to skip blocks when downsampling via `thanos tools bucket mark --marker=no-downsample-mark.json`. This will skip downsampling for blocks with the new marker.
- #5977 Tools: Added remove flag on bucket mark command to remove deletion, no-downsample or no-compact markers on the block.
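To illustrate the idea behind a per-call downloaded-bytes limit such as `--store.grpc.downloaded-bytes-limit`, here is a minimal sketch. `BytesLimiter` and `BytesLimitExceeded` are hypothetical names, and treating a limit of 0 as "no limit" is an assumption; this is not the Thanos implementation.

```python
class BytesLimitExceeded(Exception):
    """Raised when a single call tries to download more bytes than allowed."""


class BytesLimiter:
    """Tracks bytes downloaded during one Series/LabelNames/LabelValues call."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # Assumption: 0 disables the limit.
        self.downloaded = 0

    def reserve(self, num_bytes: int) -> None:
        """Account for num_bytes; fail the call once the limit is exceeded."""
        self.downloaded += num_bytes
        if self.limit > 0 and self.downloaded > self.limit:
            raise BytesLimitExceeded(
                f"downloaded {self.downloaded} bytes, limit is {self.limit}"
            )


if __name__ == "__main__":
    limiter = BytesLimiter(limit=1024)  # one limiter per call
    limiter.reserve(600)  # e.g. postings
    try:
        limiter.reserve(600)  # e.g. chunks, pushing the call over the limit
    except BytesLimitExceeded as err:
        print("call rejected:", err)
```

A histogram of observed per-call sizes, like the postings size metric mentioned above, is the kind of data one would consult before choosing the limit value.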
Changed
- #5785 Query: `thanos_store_nodes_grpc_connections` now trims `external_labels` label name longer than 1000 characters. It also allows customizations in what labels to preserve using `query.conn-metric.label` flag.
- #5542 Mixin: Added query concurrency panel to Querier dashboard.
- #5846 Query Frontend: vertical query sharding supports subqueries.
- #5909 Receive: Compact tenant head after no appends have happened for 1.5 `tsdb.max-block-size`.
- #5593 Cache: Switched Redis client to Rueidis. Rueidis is faster and provides client-side caching. It is highly recommended to use it so that repeated requests for the same key would not be needed.
- #5896 *: Upgraded Prometheus to v0.40.7 without implementing native histogram support. Querying native histograms will fail with `Error executing query: invalid chunk encoding "<unknown>"` and native histograms in write requests are ignored.
- #5838 Mixin: Added data touched type to Store dashboard.
- #5922 Compact: Retry on clean, partial marked errors when possible.
Removed
- #5824 Mixin: Remove noisy `ThanosReceiveTrafficBelowThreshold` alert.
New Contributors
- @rajivharlalka made their first contribution in #5631
- @Atharva-Shinde made their first contribution in #5716
- @clwluvw made their first contribution in #5856
- @VicThomas made their first contribution in #5884
- @karster made their first contribution in #5886
- @sumanpaikdev made their first contribution in #5868
- @abbyssoul made their first contribution in #5893
- @juanrh made their first contribution in #5795
- @hyder made their first contribution in #5928
- @aarnq made their first contribution in #5940
- @4orty made their first contribution in #5953
- @jatinagwal made their first contribution in #5967
- @RohitKochhar made their first contribution in #5945
- @rabenhorst made their first contribution in #5896
- @kama910 made their first contribution in #5981
- @Vishvsalvi made their first contribution in #5979
- @maheshbaliga made their first contribution in #5977
Commits
- CHANGELOG: mark 0.29.0 as in progress by @GiedriusS in #5808
- store: add histogram for postings size by @GiedriusS in #5814
- Store/Receivers: Calculating chunk hashes on stores/receivers by @pedro-stanaka in #5703
- Use pre-calculated hashes by @fpetkovski in #5817
- Short-circuit chunk dedup in proxy by @fpetkovski in #5816
- deps: Updated promql-engine to latest. by @bwplotka in #5821
- Query: Trim very long external labels and add cmd flag to optionally specify metric labels to collect by @utukJ in #5785
- CircleCI: Replace checkout step with custom command by @matej-g in #5829
- store: add downloaded bytes limit by @GiedriusS in #5801
- Mixin: Remove low ingestion rate warning for receiver by @matej-g in #5824
- Mixin: Remove low ingestion rate warning for receiver (fix tests) by @matej-g in #5831
- Fix Typo's in recieve.md by @rajivharlalka in #5631
- add panel Query Concurrency to dashboard mixin. by @raptorsun in #5542
- docs: Added guide for Community Office Hours shepherding. by @bwplotka in #5568
- *: Clean up stale bot config file by @matej-g in #5834
- Receive: Add experimental snapshot on shutdown by @matej-g in #5836
- Feature...
v0.30.0-rc.0
v0.30 brings many important fixes & optimizations to compaction, store gateway, receive replication and querying. Make sure to try the new PromQL engine which is more & more efficient every week.
NOTE: Querier's `query.promql-engine` flag enabling the new PromQL engine is now unhidden. We encourage users to use the new experimental PromQL engine for efficiency reasons.
Furthermore, we recommend you use Redis as a caching client (if you use store GW or query frontend caching) and the Ketama algorithm as the receiver hashing algorithm (`--receive.hashrings-algorithm=ketama`, introducing consistent hashing to receiver).
Enjoy & Happy Christmas Holidays! 🎉
Changes
Fixed
- #5716 DNS: Fix miekgdns resolver LookupSRV to work with CNAME records.
- #5844 Query Frontend: Fixes @ modifier time range when splitting queries by interval.
- #5854 Query Frontend: `lookback_delta` param is now handled in query frontend.
- #5860 Query: Fixed bug of not showing query warnings in Thanos UI.
- #5856 Store: Fixed handling of debug logging flag.
- #5230 Rule: Stateless ruler supports restoring `for` state from query API servers. The query API servers should be able to access the remote write storage.
- #5880 Query Frontend: Fixes some edge cases of query sharding analysis.
- #5893 Cache: Fixed redis client not respecting `SetMultiBatchSize` config value.
- #5966 Query: Stop relying on non-existent hints for mint and maxt when selecting series for the `api/v1/series` HTTP endpoint.
- #5948 Store: Fixed wrong calculation of `chunks_fetched_duration`.
- #5910 Receive: Fixed ketama quorum bug that could cause a success response for failed replication. This also heavily optimizes receiver CPU use.
Added
- #5814 Store: Added metric `thanos_bucket_store_postings_size_bytes` that shows the distribution of how many postings (in bytes) were needed for each Series() call in Thanos Store. Useful for determining limits.
- #5703 StoreAPI: Added `hash` field to series' chunks. Store gateway and receive implement that field and proxy leverages it for quicker deduplication.
- #5801 Store: Added a new flag `--store.grpc.downloaded-bytes-limit` that limits the number of bytes downloaded in each Series/LabelNames/LabelValues call. Use `thanos_bucket_store_postings_size_bytes` for determining the limits.
- #5836 Receive: Added hidden flag `tsdb.memory-snapshot-on-shutdown` to enable experimental TSDB feature to snapshot on shutdown. This is intended to speed up receiver restart.
- #5839 Receive: Added parameter `--tsdb.out-of-order.time-window` to set time window for experimental out-of-order samples ingestion. Disabled by default (set to 0s). Please note if you enable this option and you use compactor, make sure you set the `--enable-vertical-compaction` flag, otherwise you might risk compactor halt.
- #5889 Query Frontend: Added support for vertical sharding `label_replace` and `label_join` functions.
- #5865 Compact: Retry on sync metas error.
- #5819 Store: Added a few objectives for Store's data summaries (touched/fetched amount and sizes). They are: 50, 95, and 99 quantiles.
- #5837 Store: Added streaming retrieval of series from object storage.
- #5940 Objstore: Support for authenticating to Swift using application credentials.
- #5945 Tools: Added new `no-downsample` marker to skip blocks when downsampling via `thanos tools bucket mark --marker=no-downsample-mark.json`. This will skip downsampling for blocks with the new marker.
- #5977 Tools: Added remove flag on bucket mark command to remove deletion, no-downsample or no-compact markers on the block.
Changed
- #5785 Query: `thanos_store_nodes_grpc_connections` now trims `external_labels` label name longer than 1000 characters. It also allows customizations in what labels to preserve using `query.conn-metric.label` flag.
- #5542 Mixin: Added query concurrency panel to Querier dashboard.
- #5846 Query Frontend: vertical query sharding supports subqueries.
- #5909 Receive: Compact tenant head after no appends have happened for 1.5 `tsdb.max-block-size`.
- #5593 Cache: Switched Redis client to Rueidis. Rueidis is faster and provides client-side caching. It is highly recommended to use it so that repeated requests for the same key would not be needed.
- #5896 *: Upgraded Prometheus to v0.40.7 without implementing native histogram support. Querying native histograms will fail with `Error executing query: invalid chunk encoding "<unknown>"` and native histograms in write requests are ignored.
- #5838 Mixin: Added data touched type to Store dashboard.
- #5922 Compact: Retry on clean, partial marked errors when possible.
Removed
- #5824 Mixin: Remove noisy `ThanosReceiveTrafficBelowThreshold` alert.
New Contributors
- @rajivharlalka made their first contribution in #5631
- @Atharva-Shinde made their first contribution in #5716
- @clwluvw made their first contribution in #5856
- @VicThomas made their first contribution in #5884
- @karster made their first contribution in #5886
- @sumanpaikdev made their first contribution in #5868
- @abbyssoul made their first contribution in #5893
- @juanrh made their first contribution in #5795
- @hyder made their first contribution in #5928
- @aarnq made their first contribution in #5940
- @4orty made their first contribution in #5953
- @jatinagwal made their first contribution in #5967
- @RohitKochhar made their first contribution in #5945
- @rabenhorst made their first contribution in #5896
- @kama910 made their first contribution in #5981
- @Vishvsalvi made their first contribution in #5979
- @maheshbaliga made their first contribution in #5977
Commits
- CHANGELOG: mark 0.29.0 as in progress by @GiedriusS in #5808
- store: add histogram for postings size by @GiedriusS in #5814
- Store/Receivers: Calculating chunk hashes on stores/receivers by @pedro-stanaka in #5703
- Use pre-calculated hashes by @fpetkovski in #5817
- Short-circuit chunk dedup in proxy by @fpetkovski in #5816
- deps: Updated promql-engine to latest. by @bwplotka in #5821
- Query: Trim very long external labels and add cmd flag to optionally specify metric labels to collect by @utukJ in #5785
- CircleCI: Replace checkout step with custom command by @matej-g in #5829
- store: add downloaded bytes limit by @GiedriusS in #5801
- Mixin: Remove low ingestion rate warning for receiver by @matej-g in #5824
- Mixin: Remove low ingestion rate warning for receiver (fix tests) by @matej-g in #5831
- Fix Typo's in recieve.md by @rajivharlalka in #5631
- add panel Query Concurrency to dashboard mixin. by @raptorsun in #5542
- docs: Added guide for Community Office Hours shepherding. by @bwplotka in #5568
- *: Clean up stale bot config file by @matej-g in #5834
- Receive: Add experimental snapshot on shutdown by @matej-g in https://github.co...
v0.29.0
`v0.29.0` is out after 69 days of work since `v0.28.0`! Thank you to all 35 contributors who have contributed to this release. It wouldn't be the same without you. `v0.29.0` has no changes since the release candidate.
Highlights include OpenTelemetry support, improved Azure support with a new SDK, increased query speed, and new receive features to limit series per tenant.
First, let's celebrate new contributors; the changelog after that has all of the details. Please try it out and let us know if you spot any problems!
New Contributors
- @nikitapecasa made their first contribution in #5448
- @shenxn made their first contribution in #5455
- @SrushtiSapkale made their first contribution in #5447
- @chris-ng-scmp made their first contribution in #5466
- @olasd made their first contribution in #5477
- @BouchaaraAdil made their first contribution in #5465
- @eharcevs made their first contribution in #5453
- @bishal7679 made their first contribution in #5486
- @naveensrinivasan made their first contribution in #5364
- @Firxiao made their first contribution in #5496
- @Akshit42-hue made their first contribution in #5529
- @audig made their first contribution in #5534
- @Juneezee made their first contribution in #5574
- @raptorsun made their first contribution in #5439
- @oronsh made their first contribution in #5596
- @jzelinskie made their first contribution in #5611
- @tusharxoxoxo made their first contribution in #5620
- @zvlb made their first contribution in #5573
- @padhiar-aditya made their first contribution in #5670
- @pedro-stanaka made their first contribution in #5666
- @prajain12 made their first contribution in #5678
- @xdavidwu made their first contribution in #5656
- @Abirdcfly made their first contribution in #5660
- @sdufel made their first contribution in #5684
- @mtlang made their first contribution in #5690
- @vhbfernandes made their first contribution in #5696
- @davinci26 made their first contribution in #5702
- @haanhvu made their first contribution in #5641
- @wanjunlei made their first contribution in #5723
- @dbut023 made their first contribution in #5674
- @utukJ made their first contribution in #5738
- @isantospardo made their first contribution in #5744
- @amincheloh made their first contribution in #5769
- @Rahulkumar2002 made their first contribution in #5749
- @aarontams made their first contribution in #5778
Fixed
- #5642 Receive: Log labels correctly in writer debug messages.
- #5655 Receive: Fix recreating already pruned tenants.
- #5702 Store: Upgrade minio-go/v7 to fix panic caused by leaked goroutines.
- #5736 Compact: Fix crash in GatherNoCompactionMarkFilter.NoCompactMarkedBlocks.
- #5763 Compact: Enable metadata cache.
- #5759 Compact: Fix missing duration log key.
- #5799 Query Frontend: Fixed sharding behaviour for vector matches. Now queries with sharding should work properly where the query looks like: `foo and without (lbl) bar`.
Added
- #5565 Receive: Allow remote write request limits to be defined per file and tenant (experimental).
- #5654 Query: add `--grpc-compression` flag that controls the compression used in gRPC client. With the flag it is now possible to compress the traffic between Query and StoreAPI nodes: you get lower network usage in exchange for a bit higher CPU/RAM usage.
- #5650 Query Frontend: Add sharded queries metrics. `thanos_frontend_sharding_middleware_queries_total` shows how many queries were sharded or not sharded.
- #5658 Query Frontend: Introduce new optional parameters (`query-range.min-split-interval`, `query-range.max-split-interval`, `query-range.horizontal-shards`) to implement more dynamic horizontal query splitting.
- #5721 Store: Add metric `thanos_bucket_store_empty_postings_total` for number of empty postings when fetching series.
- #5723 Compactor: Support disabling block viewer UI.
- #5674 Query Frontend/Store: Add support connecting to redis using TLS.
- #5734 Store: Support disable block viewer UI.
- #5411 Tracing: Add OpenTelemetry Protocol exporter.
- #5779 Objstore: Support specifying S3 storage class.
- #5741 Query: add metrics on how much data is being selected by downstream Store APIs.
- #5673 Receive: Reload tenant limit configuration on file change.
- #5749 Query Frontend: Added small LRU cache to cache query analysis results.
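Horizontal query splitting (see #5658) cuts one range query into several shorter ones by time. The sketch below shows only fixed-interval splitting, assuming integer timestamps in seconds and sub-ranges aligned to multiples of the interval. `split_by_interval` is a hypothetical helper; how the query frontend derives the interval from the three `query-range.*` parameters is not shown here.

```python
from typing import List, Tuple


def split_by_interval(start: int, end: int, interval: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [start, end] into sub-ranges aligned to interval."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    ranges = []
    cur = start
    while cur <= end:
        # Last timestamp of the aligned bucket that contains cur.
        bucket_end = (cur // interval + 1) * interval - 1
        ranges.append((cur, min(bucket_end, end)))
        cur = bucket_end + 1
    return ranges


if __name__ == "__main__":
    # A query over [50, 250] with a 100-second split interval.
    for sub_range in split_by_interval(50, 250, 100):
        print(sub_range)
```

Aligning sub-ranges to interval boundaries means that overlapping queries produce identical sub-queries, which is what makes per-interval result caching effective.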
Changed
- #5738 Global: replace `crypto/sha256` with `minio/sha256-simd` to make hash calculation faster in metadata and reloader packages.
- #5648 Query Frontend: cache vertical shards in query-frontend.
- #5753 Build with Go 1.19.
- #5255 Query: Use k-way merging for the proxying logic. The proxying sub-system now uses much less resources (~25-80% less CPU usage, ~30-50% less RAM usage according to our benchmarks). Reduces query duration by a few percent on queries with lots of series.
- #5690 Compact: update `--debug.accept-malformed-index` flag to apply to downsampling. Previously the flag only applied to compaction, and fatal errors would still occur when downsampling was attempted.
- #5707 Objstore: Update objstore to latest version which includes a refactored Azure Storage Account implementation with a new SDK.
- #5641 Store: Remove hardcoded labels in shard matcher.
- #5641 Query: Inject unshardable le label in query analyzer.
- #5685 Receive: Make active/head series limiting configuration per tenant by adding it to new limiting config.
- #5411 Tracing: Change Jaeger exporter from OpenTracing to OpenTelemetry. Options `RPC Metrics`, `Gen128Bit` and `Disabled` are now deprecated and won't have any effect when set ⚠️.
- #5767 *: Upgrade Prometheus to v2.39.0.
- #5771 *: Upgrade Prometheus to v2.39.1.
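The k-way merging mentioned in #5255 can be sketched in general terms as follows. This is an illustration of the technique only, assuming every input stream is already sorted; `k_way_merge` and the simplified `Series` type are hypothetical and do not mirror the Thanos proxy code.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# A series identified by its label pairs, simplified to a tuple of strings.
Series = Tuple[str, ...]


def k_way_merge(streams: List[Iterable[Series]]) -> Iterator[Series]:
    """Merge k sorted streams into one sorted stream, dropping exact duplicates."""
    iterators = [iter(stream) for stream in streams]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    last = None
    while heap:
        series, idx = heapq.heappop(heap)
        if series != last:
            yield series
            last = series
        # Refill the heap from the stream the popped element came from.
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))


if __name__ == "__main__":
    store_a = [("job=api", "pod=a"), ("job=api", "pod=c")]
    store_b = [("job=api", "pod=b"), ("job=api", "pod=c")]
    for series in k_way_merge([store_a, store_b]):
        print(series)
```

The heap never holds more than one entry per stream, so each emitted series costs O(log k) and the responses do not need to be buffered and re-sorted as a whole.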
Full Changelog: v0.28.1...v0.29.0
v0.29.0-rc.0
`v0.29.0-rc.0` is out after 56 days of work since `v0.28.0`! Thank you to all 35 contributors who have contributed to this release. It wouldn't be the same without you.
Highlights include OpenTelemetry support, improved Azure support with a new SDK, increased query speed, and new receive features to limit series per tenant.
First, let's celebrate new contributors; the changelog after that has all of the details. Please try out this RC and let us know if you spot any problems!
New Contributors
- @nikitapecasa made their first contribution in #5448
- @shenxn made their first contribution in #5455
- @SrushtiSapkale made their first contribution in #5447
- @chris-ng-scmp made their first contribution in #5466
- @olasd made their first contribution in #5477
- @BouchaaraAdil made their first contribution in #5465
- @eharcevs made their first contribution in #5453
- @bishal7679 made their first contribution in #5486
- @naveensrinivasan made their first contribution in #5364
- @Firxiao made their first contribution in #5496
- @Akshit42-hue made their first contribution in #5529
- @audig made their first contribution in #5534
- @Juneezee made their first contribution in #5574
- @raptorsun made their first contribution in #5439
- @oronsh made their first contribution in #5596
- @jzelinskie made their first contribution in #5611
- @tusharxoxoxo made their first contribution in #5620
- @zvlb made their first contribution in #5573
- @padhiar-aditya made their first contribution in #5670
- @pedro-stanaka made their first contribution in #5666
- @prajain12 made their first contribution in #5678
- @xdavidwu made their first contribution in #5656
- @Abirdcfly made their first contribution in #5660
- @sdufel made their first contribution in #5684
- @mtlang made their first contribution in #5690
- @vhbfernandes made their first contribution in #5696
- @davinci26 made their first contribution in #5702
- @haanhvu made their first contribution in #5641
- @wanjunlei made their first contribution in #5723
- @dbut023 made their first contribution in #5674
- @utukJ made their first contribution in #5738
- @isantospardo made their first contribution in #5744
- @amincheloh made their first contribution in #5769
- @Rahulkumar2002 made their first contribution in #5749
- @aarontams made their first contribution in #5778
Fixed
- #5642 Receive: Log labels correctly in writer debug messages.
- #5655 Receive: Fix recreating already pruned tenants.
- #5702 Store: Upgrade minio-go/v7 to fix panic caused by leaked goroutines.
- #5736 Compact: Fix crash in GatherNoCompactionMarkFilter.NoCompactMarkedBlocks.
- #5763 Compact: Enable metadata cache.
- #5759 Compact: Fix missing duration log key.
- #5799 Query Frontend: Fixed sharding behaviour for vector matches. Now queries with sharding should work properly where the query looks like: `foo and without (lbl) bar`.
Added
- #5565 Receive: Allow remote write request limits to be defined per file and tenant (experimental).
- #5654 Query: add `--grpc-compression` flag that controls the compression used in gRPC client. With the flag it is now possible to compress the traffic between Query and StoreAPI nodes: you get lower network usage in exchange for a bit higher CPU/RAM usage.
- #5650 Query Frontend: Add sharded queries metrics. `thanos_frontend_sharding_middleware_queries_total` shows how many queries were sharded or not sharded.
- #5658 Query Frontend: Introduce new optional parameters (`query-range.min-split-interval`, `query-range.max-split-interval`, `query-range.horizontal-shards`) to implement more dynamic horizontal query splitting.
- #5721 Store: Add metric `thanos_bucket_store_empty_postings_total` for number of empty postings when fetching series.
- #5723 Compactor: Support disabling block viewer UI.
- #5674 Query Frontend/Store: Add support connecting to redis using TLS.
- #5734 Store: Support disable block viewer UI.
- #5411 Tracing: Add OpenTelemetry Protocol exporter.
- #5779 Objstore: Support specifying S3 storage class.
- #5741 Query: add metrics on how much data is being selected by downstream Store APIs.
- #5673 Receive: Reload tenant limit configuration on file change.
- #5749 Query Frontend: Added small LRU cache to cache query analysis results.
Changed
- #5738 Global: replace `crypto/sha256` with `minio/sha256-simd` to make hash calculation faster in metadata and reloader packages.
- #5648 Query Frontend: cache vertical shards in query-frontend.
- #5753 Build with Go 1.19.
- #5255 Query: Use k-way merging for the proxying logic. The proxying sub-system now uses much less resources (~25-80% less CPU usage, ~30-50% less RAM usage according to our benchmarks). Reduces query duration by a few percent on queries with lots of series.
- #5690 Compact: update `--debug.accept-malformed-index` flag to apply to downsampling. Previously the flag only applied to compaction, and fatal errors would still occur when downsampling was attempted.
- #5707 Objstore: Update objstore to latest version which includes a refactored Azure Storage Account implementation with a new SDK.
- #5641 Store: Remove hardcoded labels in shard matcher.
- #5641 Query: Inject unshardable le label in query analyzer.
- #5685 Receive: Make active/head series limiting configuration per tenant by adding it to new limiting config.
- #5411 Tracing: Change Jaeger exporter from OpenTracing to OpenTelemetry. Options `RPC Metrics`, `Gen128Bit` and `Disabled` are now deprecated and won't have any effect when set ⚠️.
- #5767 *: Upgrade Prometheus to v2.39.0.
- #5771 *: Upgrade Prometheus to v2.39.1.
Full Changelog: v0.28.1...v0.29.0-rc.0
v0.28.1
v0.28.0
What's Changed
Fixed
- #5502 Receive: Handle exemplar storage errors as conflict error.
- #5534 Query: Set struct return by query API alerts same as prometheus API.
- #5554 Query/Receiver: Fix querying exemplars from multi-tenant receivers.
- #5583 Query: Fix data race between Respond() and query/queryRange functions. Fixes #5410.
Added
- #5440 HTTP metrics: export number of in-flight HTTP requests.
- #5424 Receive: Export metrics regarding size of remote write requests.
- #5420 Receive: Automatically remove stale tenants.
- #5472 Receive: Add new tenant metrics to example dashboard.
- #5475 Compact/Store: Added `--block-files-concurrency` allowing to configure number of goroutines for downloading and uploading block files during compaction.
- #5470 Receive: Expose TSDB stats as metrics for all tenants.
- #5493 Compact: Added `--compact.blocks-fetch-concurrency` allowing to configure number of goroutines for downloading blocks during compactions.
- #5480 Query: Expose endpoint info timeout as a hidden flag `--endpoint.info-timeout`.
- #5527 Receive: Add per request limits for remote write. Added four new hidden flags `--receive.write-request-limits.max-size-bytes`, `--receive.write-request-limits.max-series`, `--receive.write-request-limits.max-samples` and `--receive.write-request-limits.max-concurrency` for limiting requests max body size, max amount of series, max amount of samples and max amount of concurrent requests.
- #5520 Receive: Meta-monitoring based active series limiting (experimental). This mode is only available if Receiver is in Router or RouterIngestor mode, and config is provided. Added four new hidden flags: `receive.tenant-limits.max-head-series` for the max active series for the tenant, `receive.tenant-limits.meta-monitoring-url` for the Meta-monitoring URL, `receive.tenant-limits.meta-monitoring-query` for specifying the PromQL query to execute and `receive.tenant-limits.meta-monitoring-client` for specifying HTTP client configs.
- #5555 Query: Added `--query.active-query-path` flag, allowing the user to configure the directory to create an active query tracking file, `queries.active`, for different resolution.
- #5566 Receive: Added experimental support to enable chunk write queue via `--tsdb.write-queue-size` flag.
- #5575 Receive: Add support for gRPC compression with snappy.
- #5508 Receive: Validate labels in write requests.
- #5439 Mixin: Add Alert ThanosQueryOverload to Mixin.
- #5342 Query/Query Frontend: Implement vertical sharding at query frontend for range queries.
- #5561 Query Frontend: Support instant query vertical sharding.
- #5453 Compact: Skip erroneous empty non `*AggrChunk` chunks during 1h downsampling of 5m resolution blocks.
- #5607 Query: Support custom lookback delta from request in query api.
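Vertical sharding (#5342, #5561) splits a query by series rather than by time. The sketch below illustrates the general idea for a `sum by (...)` aggregation, assuming shards are assigned by hashing only the grouping labels so that a group never spans two shards. All names are hypothetical, and CRC32 is chosen purely for illustration; this is not how the query frontend is implemented.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]  # (labels, value)
Result = Dict[Tuple[str, ...], float]


def shard_of(labels: Dict[str, str], by: List[str], total_shards: int) -> int:
    """Pick a shard from the grouping labels only, so a group never spans shards."""
    key = ",".join(f"{name}={labels.get(name, '')}" for name in sorted(by))
    return zlib.crc32(key.encode()) % total_shards


def sum_by(samples: List[Sample], by: List[str]) -> Result:
    """Evaluate a `sum by (...)` aggregation over a set of samples."""
    out: Result = defaultdict(float)
    for labels, value in samples:
        out[tuple(labels.get(name, "") for name in by)] += value
    return dict(out)


def sharded_sum_by(samples: List[Sample], by: List[str], total_shards: int) -> Result:
    """Evaluate each shard independently, then merge the disjoint results."""
    shards: Dict[int, List[Sample]] = defaultdict(list)
    for sample in samples:
        shards[shard_of(sample[0], by, total_shards)].append(sample)
    merged: Result = {}
    for shard_samples in shards.values():
        merged.update(sum_by(shard_samples, by))
    return merged


if __name__ == "__main__":
    data = [
        ({"pod": "a", "cpu": "0"}, 1.0),
        ({"pod": "a", "cpu": "1"}, 2.0),
        ({"pod": "b", "cpu": "0"}, 5.0),
    ]
    print(sharded_sum_by(data, ["pod"], total_shards=3))
```

Because every group lands in exactly one shard, the per-shard results can simply be concatenated, and each shard can be evaluated in parallel on a smaller set of series.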
Changed
- #5447 Promclient: Ignore 405 status codes for Prometheus buildVersion requests.
- #5451 Azure: Reduce memory usage by not buffering file downloads entirely in memory.
- #5484 Update Prometheus deps to v2.36.2.
- #5511 Update Prometheus deps to v2.37.0.
- #5588 Store: Improve index header reading performance by sorting values first.
- #5596 Store: Filter external labels from matchers on LabelValues/LabelNames to improve performance.
New Contributors
- @nikitapecasa made their first contribution in #5448
- @shenxn made their first contribution in #5455
- @SrushtiSapkale made their first contribution in #5447
- @chris-ng-scmp made their first contribution in #5466
- @olasd made their first contribution in #5477
- @BouchaaraAdil made their first contribution in #5465
- @eharcevs made their first contribution in #5453
- @bishal7679 made their first contribution in #5486
- @naveensrinivasan made their first contribution in #5364
- @Firxiao made their first contribution in #5496
- @Akshit42-hue made their first contribution in #5529
- @audig made their first contribution in #5534
- @Juneezee made their first contribution in #5574
- @raptorsun made their first contribution in #5439
- @oronsh made their first contribution in #5596
- @jzelinskie made their first contribution in #5611
Full Changelog: v0.27.0...v0.28.0
v0.28.0-rc.0
What's Changed
Fixed
- #5502 Receive: Handle exemplar storage errors as conflict error.
- #5534 Query: Set struct return by query API alerts same as prometheus API.
- #5554 Query/Receiver: Fix querying exemplars from multi-tenant receivers.
- #5583 Query: Fix data race between Respond() and query/queryRange functions. Fixes #5410.
Added
- #5440 HTTP metrics: export number of in-flight HTTP requests.
- #5424 Receive: Export metrics regarding size of remote write requests.
- #5420 Receive: Automatically remove stale tenants.
- #5472 Receive: Add new tenant metrics to example dashboard.
- #5475 Compact/Store: Added `--block-files-concurrency` allowing to configure number of goroutines for downloading and uploading block files during compaction.
- #5470 Receive: Expose TSDB stats as metrics for all tenants.
- #5493 Compact: Added `--compact.blocks-fetch-concurrency` allowing to configure number of goroutines for downloading blocks during compactions.
- #5480 Query: Expose endpoint info timeout as a hidden flag `--endpoint.info-timeout`.
- #5527 Receive: Add per request limits for remote write. Added four new hidden flags `--receive.write-request-limits.max-size-bytes`, `--receive.write-request-limits.max-series`, `--receive.write-request-limits.max-samples` and `--receive.write-request-limits.max-concurrency` for limiting requests max body size, max amount of series, max amount of samples and max amount of concurrent requests.
- #5520 Receive: Meta-monitoring based active series limiting (experimental). This mode is only available if Receiver is in Router or RouterIngestor mode, and config is provided. Added four new hidden flags: `receive.tenant-limits.max-head-series` for the max active series for the tenant, `receive.tenant-limits.meta-monitoring-url` for the Meta-monitoring URL, `receive.tenant-limits.meta-monitoring-query` for specifying the PromQL query to execute and `receive.tenant-limits.meta-monitoring-client` for specifying HTTP client configs.
- #5555 Query: Added `--query.active-query-path` flag, allowing the user to configure the directory to create an active query tracking file, `queries.active`, for different resolution.
- #5566 Receive: Added experimental support to enable chunk write queue via `--tsdb.write-queue-size` flag.
- #5575 Receive: Add support for gRPC compression with snappy.
- #5508 Receive: Validate labels in write requests.
- #5439 Mixin: Add Alert ThanosQueryOverload to Mixin.
- #5342 Query/Query Frontend: Implement vertical sharding at query frontend for range queries.
- #5561 Query Frontend: Support instant query vertical sharding.
- #5453 Compact: Skip erroneous empty non `*AggrChunk` chunks during 1h downsampling of 5m resolution blocks.
- #5607 Query: Support custom lookback delta from request in query api.
Changed
- #5447 Promclient: Ignore 405 status codes for Prometheus buildVersion requests.
- #5451 Azure: Reduce memory usage by not buffering file downloads entirely in memory.
- #5484 Update Prometheus deps to v2.36.2.
- #5511 Update Prometheus deps to v2.37.0.
- #5588 Store: Improve index header reading performance by sorting values first.
- #5596 Store: Filter external labels from matchers on LabelValues/LabelNames to improve performance.
New Contributors
- @nikitapecasa made their first contribution in #5448
- @shenxn made their first contribution in #5455
- @SrushtiSapkale made their first contribution in #5447
- @chris-ng-scmp made their first contribution in #5466
- @olasd made their first contribution in #5477
- @BouchaaraAdil made their first contribution in #5465
- @eharcevs made their first contribution in #5453
- @bishal7679 made their first contribution in #5486
- @naveensrinivasan made their first contribution in #5364
- @Firxiao made their first contribution in #5496
- @Akshit42-hue made their first contribution in #5529
- @audig made their first contribution in #5534
- @Juneezee made their first contribution in #5574
- @raptorsun made their first contribution in #5439
- @oronsh made their first contribution in #5596
- @jzelinskie made their first contribution in #5611
Full Changelog: v0.27.0...v0.28.0-rc.0
v0.27.0
What's Changed
Fixed
- #5339 Receive: When running in routerOnly mode, an interrupt (SIGINT) will now exit the process.
- #5357 Store: Fix groupcache handling by making sure slashes in the cache's key are not getting interpreted by the router anymore.
- #5427 Receive: Fix Ketama hashring replication consistency. With the Ketama hashring, replication was handled by choosing subsequent nodes in the list of endpoints. This can lead to existing nodes getting more series when the hashring is scaled. This change makes replication choose subsequent nodes from the hashring, which should not create new series in old nodes when the hashring is scaled. Ketama hashring can be used by setting --receive.hashrings-algorithm=ketama.
Added
- #5337 Thanos Object Store: Add the prefix option to buckets.
- #5409 S3: Add option to force DNS style lookup.
- #5352 Cache: Add cache metrics to groupcache: thanos_cache_groupcache_bytes, thanos_cache_groupcache_evictions_total, thanos_cache_groupcache_items and thanos_cache_groupcache_max_bytes.
- #5391 Receive: Add relabeling support with the flag --receive.relabel-config-file or alternatively --receive.relabel-config.
- #5408 Receive: Add support for consistent hashrings. The flag --receive.hashrings-algorithm uses default hashmod but can also be set to ketama to leverage consistent hashrings. More technical information can be found here: https://dgryski.medium.com/consistent-hashing-algorithmic-tradeoffs-ef6b8e2fcae8.
- #5402 Receive: Implement api/v1/status/tsdb.
Changed
- #5410 Query: Close() after using query. This should reduce bumps in memory allocations.
- #5417 Ruler: ⚠️ Breaking if you have not set --eval-interval yourself and rely on its default value. Change the default evaluation interval from 30s to 1 minute in order to be compliant with the Prometheus alerting compliance specification: https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#executing-an-alerting-rule.
New Contributors
- @jmjf made their first contribution in #5319
- @fgouteroux made their first contribution in #5339
- @heylongdacoder made their first contribution in #5324
- @bisakhmondal made their first contribution in #5239
- @djdongjin made their first contribution in #5383
- @nicolastakashi made their first contribution in #5387
- @roastiek made their first contribution in #5394
- @B0go made their first contribution in #5392
- @jademcosta made their first contribution in #5337
- @dudaduarte made their first contribution in #5337
- @Jakob3xD made their first contribution in #5409
- @4xoc made their first contribution in #5153
Full Changelog: v0.26.0...v0.27.0
v0.27.0-rc.0
What's Changed
Fixed
- #5339 Receive: When running in routerOnly mode, an interrupt (SIGINT) will now exit the process.
- #5357 Store: Fix groupcache handling by making sure slashes in the cache's key are not getting interpreted by the router anymore.
- #5427 Receive: Fix Ketama hashring replication consistency. With the Ketama hashring, replication was handled by choosing subsequent nodes in the list of endpoints. This can lead to existing nodes getting more series when the hashring is scaled. This change makes replication choose subsequent nodes from the hashring, which should not create new series in old nodes when the hashring is scaled. Ketama hashring can be used by setting --receive.hashrings-algorithm=ketama.
Added
- #5337 Thanos Object Store: Add the prefix option to buckets.
- #5409 S3: Add option to force DNS style lookup.
- #5352 Cache: Add cache metrics to groupcache: thanos_cache_groupcache_bytes, thanos_cache_groupcache_evictions_total, thanos_cache_groupcache_items and thanos_cache_groupcache_max_bytes.
- #5391 Receive: Add relabeling support with the flag --receive.relabel-config-file or alternatively --receive.relabel-config.
- #5408 Receive: Add support for consistent hashrings. The flag --receive.hashrings-algorithm uses default hashmod but can also be set to ketama to leverage consistent hashrings. More technical information can be found here: https://dgryski.medium.com/consistent-hashing-algorithmic-tradeoffs-ef6b8e2fcae8
- #5402 Receive: Implement api/v1/status/tsdb.
Changed
- #5410 Query: Close() after using query. This should reduce bumps in memory allocations.
- #5417 Ruler: ⚠️ Breaking if you have not set --eval-interval yourself and rely on its default value. Change the default evaluation interval from 30s to 1 minute in order to be compliant with the Prometheus alerting compliance specification: https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#executing-an-alerting-rule.
New Contributors
- @jmjf made their first contribution in #5319
- @fgouteroux made their first contribution in #5339
- @heylongdacoder made their first contribution in #5324
- @bisakhmondal made their first contribution in #5239
- @djdongjin made their first contribution in #5383
- @nicolastakashi made their first contribution in #5387
- @roastiek made their first contribution in #5394
- @B0go made their first contribution in #5392
- @jademcosta made their first contribution in #5337
- @Jakob3xD made their first contribution in #5409
- @4xoc made their first contribution in #5153
Full Changelog: v0.26.0...v0.27.0-rc.0
v0.26.0
What's Changed
Fixed
- #5281 Blocks: Use correct separators for filesystem paths and object storage paths respectively.
- #5300 Query: Ignore cache on queries with deduplication off.
Added
- #5220 Query Frontend: Add --query-frontend.forward-header flag, forward headers to downstream querier.
- #5250 Querier: Expose Query and QueryRange APIs through GRPC.
- #5290 Add support for ppc64le
Changed
- #4838 Tracing: Changed client for Stackdriver which deprecated "type: STACKDRIVER" in tracing YAML configuration. Use type: GOOGLE_CLOUD instead (STACKDRIVER type remains for backward compatibility).
- #5170 All: Upgraded the TLS version from TLS1.2 to TLS1.3.
- #5205 Rule: Add ruler labels as external labels in stateless ruler mode.
- #5206 Cache: Add timeout for groupcache's fetch operation.
- #5218 Tools: Thanos tools bucket downsample is now running continuously.
- #5231 Tools: Bucket verify tool ignores blocks with deletion markers.
- #5244 Query: Promote negative offset and @ modifier to stable features as per Prometheus #10121.
- #5255 InfoAPI: Set store API unavailable when stores are not ready.
- #5256 Update Prometheus deps v2.33.5.
- #5271 DNS: Fix miekgdns resolver to work with CNAME records too.
Removed
- #5145 UI: Remove old Prometheus UI.
New Contributors
- @tomas-mota made their first contribution in #5202
- @appit-online made their first contribution in #5170
- @pablo-ruth made their first contribution in #5224
- @lcasi made their first contribution in #5220
- @dimitarvdimitrov made their first contribution in #5229
- @guilhermef made their first contribution in #5267
- @Zophar78 made their first contribution in #5273
- @jgbernalp made their first contribution in #5233
- @Ebaneck made their first contribution in #5289
- @mgiessing made their first contribution in #5290
Full Changelog: v0.25.2...v0.26.0