update release-notes
Signed-off-by: Vladimir Varankin <[email protected]>
narqo committed Sep 27, 2024
1 parent 7dff8c3 commit 8b83d59
Showing 1 changed file with 29 additions and 60 deletions.
89 changes: 29 additions & 60 deletions docs/sources/mimir/release-notes/v2.14.md
@@ -18,46 +18,30 @@ For the complete list of changes, refer to the [CHANGELOG](https://github.com/gr

## Features and enhancements

The minimal supported version of Go is updated to 1.22.

The streaming of the chunks from store-gateways to queriers is enabled by default.

Alertmanager adds the `-alertmanager.log-parsing-label-matchers` configuration option to control logging when parsing label matchers.
This flag is intended to be used with `-alertmanager.utf8-strict-mode-enabled` to validate UTF-8 strict mode is working as intended.
The default value for this configuration option is `false`.

Alertmanager adds the `-alertmanager.utf8-migration-logging-enabled` configuration option, that enables logging for tenant configurations,
incompatible with UTF-8 strict mode. The default value for this configuration option is `false`.
The streaming of chunks from store-gateways to queriers is enabled. This reduces the memory usage in queriers. This was an experimental
feature since Mimir 2.10, and now it's enabled by default.

Compactor adds a new `cortex_compactor_disk_out_of_space_errors_total` counter metric that tracks how many times a compaction fails
due to the compactor being out of disk. A new related alert, `MimirCompactorHasRunOutOfDiskSpace`, and its runbook are added.
due to the compactor being out of disk.

The distributor now replies with the `Retry-After` header on retryable errors. This protects Mimir from clients, including Prometheus,
that default to retrying very quickly, making recovering from an outage easier. The feature was added in Mimir 2.11 but now is enabled by default.
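
A minimal sketch of how a remote-write client might honor this header, assuming the standard HTTP semantics in which `Retry-After` carries either a number of seconds or an HTTP date. The function name `retry_after_seconds` and the one-second fallback are hypothetical and are not part of Mimir or Prometheus.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=1.0):
    """Return the delay in seconds suggested by a Retry-After header value.

    Hypothetical helper: falls back to `default` when the header is
    missing or cannot be parsed.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    # Form 1: a non-negative integer number of seconds.
    if value.isdigit():
        return float(value)
    # Form 2: an HTTP date, for example "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client can retry immediately.
    return max(0.0, (when - now).total_seconds())
```

A client would sleep for the returned duration before resending the failed write request, instead of retrying immediately.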

Incoming OTLP requests were previously size-limited with the distributor's `-distributor.max-recv-msg-size` configuration.
The distributor has a new `-distributor.max-otlp-request-size` configuration for limiting OTLP requests. The default value is 100 MiB.

The distributor now replies with the `Retry-After` header on retryable errors. This feature is controlled by the `-distributor.retry-after-header.enabled` configuration
and is enabled by default.
Ingesters can be marked as read-only as part of their downscaling procedure. The new `prepare-instance-ring-downscale` endpoint updates the read-only
status of an ingester in the ring.

## Important changes

In Grafana Mimir 2.14, the following behavior has changed:

When running a remote read request, the querier honors the time range specified in the read hints.

To prevent queue starvation with the querier-worker queue prioritization algorithm, the querier sets the minimum `-querier.max-concurrent` value to four.
Values below the allowed minimum are ignored.

The distributor configurations `-distributor.retry-after-header.max-backoff-exponent` and `-distributor.retry-after-header.base-seconds` are replaced
with `-distributor.retry-after-header.min-backoff` and `-distributor.retry-after-header.max-backoff` for easier configuration.

The circuit breakers in ingesters no longer open on per-instance limit errors. From now on, the circuit breakers
open only if push and pull requests exceed the configured duration.

The default inactivity timeout of active series in ingesters, controlled by the `-ingester.active-series-metrics-idle-timeout` configuration,
is increased from `10m` to `20m`.

Query-frontend now returns the "413 Request Entity Too Large" error when a response shard from the `/active_series` request exceeds the limit.

The following features of store-gateway are changed: `-blocks-storage.bucket-store.max-concurrent-queue-timeout` is set to five seconds;
`-blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout` is set to five seconds;
`-blocks-storage.bucket-store.max-concurrent` is set to 200;
@@ -79,59 +63,44 @@ The following deprecated configuration options were removed in this release:
## Experimental features

Grafana Mimir 2.14 includes some features that are experimental and disabled by default.
Use these feature with caution and report any issues that you encounter:

A new experimental Kafka-based ingest storage architecture is added, that decouples the write and read path through a Kafka-compatible backend.
When enabled, distributors write incoming write requests to the Kafka-compatible backend and the ingesters asynchronously replay ingested data from Kafka.

The following configuration options enable and configure the new architecture in Mimir: `-ingest-storage.enabled`, `-ingest-storage.kafka.*`,
`-ingest-storage.ingestion-partition-tenant-shard-size`, `-ingest-storage.read-consistency`, `-ingest-storage.migration.distributor-send-to-ingesters-enabled`,
`-ingester.partition-ring.*`.

Querier has an experimental streaming PromQL engine. The configuration option `-querier.query-engine=mimir` enables the engine.
Use these features with caution and report any issues that you encounter:

The ingester added an experimental `-ingester.ignore-ooo-exemplars` configuration. When set, out-of-order exemplars are no longer reported
to the remote write client.

The querier supports the experimental `limitk()` and `limit_ratio()` PromQL functions. This feature is disabled by default,
but you can enable it with the `-querier.promql-experimental-functions-enabled=true` setting in the query-frontend and the querier.

The querier supports the experimental `X-Mimir-Chunk-Info-Logger` header that triggers logging information about TSDB chunks loaded from ingesters and store-gateways in the querier.
The header should contain a comma-separated list of labels whose values are included in the logs.
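
As a rough illustration of the header format described above, the following sketch builds the header value from a list of label names. The helper name `chunk_info_logger_headers` and the example labels are assumptions for illustration only; the header name is the only detail taken from the release notes.

```python
def chunk_info_logger_headers(labels):
    """Build request headers asking the querier to log chunk information.

    Hypothetical helper: joins the label names into the comma-separated
    list that the X-Mimir-Chunk-Info-Logger header is described as carrying.
    """
    cleaned = [label.strip() for label in labels if label.strip()]
    if not cleaned:
        # No labels requested, so do not send the header at all.
        return {}
    return {"X-Mimir-Chunk-Info-Logger": ",".join(cleaned)}


# Example label names, chosen only for illustration.
headers = chunk_info_logger_headers(["namespace", "pod"])
```

The resulting dictionary can be passed as extra headers on a query request by whichever HTTP client is in use.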

The ruler includes the experimental `-ruler.rule-evaluation-write-enabled` configuration that disables writing the result of rule evaluation to ingesters.
This feature is meant to be used for testing.

## Bug fixes

- Ruler: add support for draining any outstanding alert notifications before shutting down. Enable this setting with the `-ruler.drain-notification-queue-on-shutdown=true` CLI flag.
- Query-frontend: fix `-querier.max-query-lookback` enforcement when `-compactor.blocks-retention-period` is not set, and vice versa.
- Alertmanager: fix configuration validation gap around unreferenced templates.
- Alertmanager: fix goroutine leak when stored configuration fails to apply and there is no existing tenant alertmanager.
- Alertmanager: fix receiver firewall to detect `0.0.0.0` and IPv6 interface-local multicast address as local addresses.
- Alertmanager: fix per-tenant silence limits not reloaded during runtime.
- Alertmanager: fix bugs in silences that could cause an existing silence to expire/be deleted when updating the silence fails. This could happen when the updated silence was invalid or exceeded limits.
- Alertmanager: fix help message for utf-8-strict-mode.
- Compactor: fix a race condition between different compactor replicas that may cause a deleted block to be referenced as non-deleted in the bucket index.
- Configuration: multi-line environment variables are flattened during injection to be compatible with YAML syntax.
- HA Tracker: store correct timestamp for the last-received request from the elected replica.
- Ingester: fix the sporadic `not found` error causing an internal server error if label names are queried with matchers during head compaction.
- Ingester, store-gateway: fix case insensitive regular expressions not correctly matching some Unicode characters.
- Query-frontend: "query stats" log includes the actual `status_code` when the request fails due to an error occurring in the query-frontend itself.
- Store-gateway: fixed a case where, on a quick subsequent restart, the previous lazy-loaded index header snapshot was overwritten by a partially loaded one.
- Ingester: fixed timestamp reported in the "the sample has been rejected because its timestamp is too old" error when the write request contains only histograms.
- Store-gateway: store sparse index headers atomically to disk.
- Query scheduler: fix a panic in request queueing.
- Query-frontend: fix `-querier.max-query-lookback` and `-compactor.blocks-retention-period` enforcement in query-frontend when one of the two is not set.
- Query-frontend: "query stats" log includes the actual `status_code` when the request fails due to an error occurring in the query-frontend itself.
- Query-frontend: ensure that internal errors result in an HTTP 500 response code instead of a 422 response code.
- Query-frontend: return annotations generated during evaluation of sharded queries.
- Query-scheduler: fix a panic in request queueing.
- Querier: fix the issue where "context canceled" is logged for trace spans for requests to store-gateways that return no series when chunks streaming is enabled.
- Alertmanager: Fix per-tenant silence limits not reloaded during runtime.
- Alertmanager: Fix bugs in silences that could cause an existing silence to expire/be deleted when updating the silence fails. This could happen when the updated silence was invalid or exceeded limits.
- Alertmanager: Fix help message for utf-8-strict-mode.
- Query-frontend: Ensure that internal errors result in an HTTP 500 response code instead of a 422 response code.
- Configuration: Multi-line environment variables are flattened during injection to be compatible with YAML syntax.
- Querier: fix issue where queries can return incorrect results if a single store-gateway returns overlapping chunks for a series.
- HA Tracker: store correct timestamp for the last-received request from the elected replica.
- Querier: do not return `grpc: the client connection is closing` errors as HTTP `499`.
- Compactor: fix a race condition between different compactor replicas that may cause a deleted block to be referenced as non-deleted in the bucket index.
- Querier: fix issue where some native histogram-related warnings were not emitted when `rate()` was used over native histograms.
- Ruler: map invalid org-id errors to the 400 status code.
- Querier: Fix invalid query results when multiple chunks are merged.
- Query-frontend: return annotations generated during evaluation of sharded queries.
- Querier: Support optional start and end times on `/prometheus/api/v1/labels`, `/prometheus/api/v1/label/<label>/values`, and `/prometheus/api/v1/series` when `max_query_into_future: 0`.
- Alertmanager: Fix configuration validation gap around unreferenced templates.
- Alertmanager: Fix goroutine leak when stored configuration fails to apply and there is no existing tenant alertmanager.
- Querier: fix invalid query results when multiple chunks are merged.
- Querier: support optional start and end times on `/prometheus/api/v1/labels`, `/prometheus/api/v1/label/<label>/values`, and `/prometheus/api/v1/series` when `max_query_into_future: 0`.
- Querier: fix issue where both recently compacted blocks and their source blocks can be skipped during querying if store-gateways are restarting.
- Alertmanager: fix receiver firewall to detect `0.0.0.0` and IPv6 interface-local multicast address as local addresses.
- Ruler: add support for draining any outstanding alert notifications before shutting down. Enable this setting with the `-ruler.drain-notification-queue-on-shutdown=true` CLI flag.
- Store-gateway: fixed a case where, on a quick subsequent restart, the previous lazy-loaded index header snapshot was overwritten by a partially loaded one.
- Store-gateway: store sparse index headers atomically to disk.
- Ruler: map invalid org-id errors to the 400 status code.

### Helm chart improvements
