Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Info and StateSet types cause errors when scrapped by Prometheus #2248

Closed
defenestration opened this issue Nov 22, 2023 · 6 comments · Fixed by #2270
Closed

Info and StateSet types cause errors when scrapped by Prometheus #2248

defenestration opened this issue Nov 22, 2023 · 6 comments · Fixed by #2270
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@defenestration
Copy link

What happened:
When kube-state-metrics is configured with custom resources that produce Info and/or StateSet type of metrics, and Prometheus is configured with --enable-feature=native-histograms, prometheus is unable to scrape the kube-state-metrics target.

Filed an issue with prometheus here prometheus/prometheus#13172 but seems like a possible issue with KSM as well.

What you expected to happen:

Successful scraping of metrics.

How to reproduce it (as minimally and precisely as possible):

Run prometheus with --enable-feature=native-histograms and setup kube-state-metrics with a metric of type: Info or Stateset.

See a fluxcd/flux2-monitoring-example#14 or defenestration/flux2-monitoring-example#1 for an example setup.

Anything else we need to know?:

With the native-histograms feature disabled, prometheus seems to be able to scrape successfully. So a bug is also filed with prometheus.

I guess the suggestion would be to include a setting on the custom resource to produce a promehteus-compatible type on the /metrics page for these custom resource types.

Environment:

  • kube-state-metrics version: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.1
  • Kubernetes version (use kubectl version): v1.27.7+k3s2
  • Cloud provider or hardware configuration: kind
  • Other info:
@defenestration defenestration added the kind/bug Categorizes issue or PR as related to a bug. label Nov 22, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 22, 2023
@CatherineF-dev
Copy link
Contributor

Could you provide detailed steps on reproducing this issue? You can follow #2223 (comment)

@dgrisonnet
Copy link
Member

The problem seem to be that when Prometheus has --native-histograms enabled, it tries to negociate for protobuf first and we manually force the text format instead since proto isn't supported:

// We do not support protobuf at the moment. Fall back to FmtText if the negotiated exposition format is not FmtOpenMetrics See: https://github.com/kubernetes/kube-state-metrics/issues/2022
if contentType != expfmt.FmtOpenMetrics_1_0_0 && contentType != expfmt.FmtOpenMetrics_0_0_1 {
contentType = expfmt.FmtText
}
.

However, there doesn't seem to be any mechanism in custom-state-metrics to revert back to Prometheus types, so we continue to expose OpenMetrics types whilst the content-type is text.

I think we should rework #1930 to enable the OpenMetrics types only when the OpenMetrics format is negociated.

@mrueg
Copy link
Member

mrueg commented Nov 28, 2023

Yes we can definitely improve the negotiation part and contributions that do that are welcome.
My current suggestion for to workaround this is not to use Info or StateSet types in your custom config for CRS.

@dgrisonnet
Copy link
Member

/assign @rexagod
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 30, 2023
@rexagod
Copy link
Member

rexagod commented Dec 13, 2023

IIUC All unrecognized types are categorized as COUNTERs (also, here), which are validated later on when consuming them in common, which is where this error pops up from.

As discussed with Damien in a call earlier, the best way forward to bypass the absence of certain OpenMetrics types in Prometheus, would be to detect if a Proto negotiation is in effect, and try to specify a text-based format, since owing to the sheer amount of metrics that KSM generates, encoding these into metrics objects to facilitate things for KSM's own protobuf client would increase compute in a way that demands rigorous benchmarking beforehand.

EDIT: This detect-then-enforce approach would keep things stable for dashboards, etc., that have one-or-more use-cases for the HELP and TYPE texts. For eg., without this, users may end up seeing gauge for their info or stateset metrics. Additionally, both of these types are used within CRS, and not in native metrics definitions.

I'm wondering we might want to expose the scrape_protocols API in https://github.com/kubernetes/component-base/tree/master/metrics/prometheus and then bring that here (or introduce this here for now and replace that once we have an impl. in k/k, since we'll be needing that for kubernetes/kubernetes#119949 anyway.

@rexagod
Copy link
Member

rexagod commented Dec 13, 2023

Ah, scrape_protocols is a client-side API, I imagine this renders it irrelevant for the use-case here, although, we can still manually set headers from the supported list of text-based formats (as we do right now albeit not with protobufs in consideration).

So essentially, if the requested format is text-based, do nothing, else s/info|stateset/gauge.

rexagod added a commit to rexagod/kube-state-metrics that referenced this issue Dec 13, 2023
Fallback to `gauge` metric type when the negotiated content-type is
`protofmt`-based, since Prometheus' protobuf machinery does not
recognize all OpenMetrics types (`info` and `statesets` in this
context).

Fixes: kubernetes#2248

Signed-off-by: Pranshu Srivastava <[email protected]>
rexagod added a commit to rexagod/kube-state-metrics that referenced this issue Dec 13, 2023
Fallback to `gauge` metric type when the negotiated content-type is
`protofmt`-based, since Prometheus' protobuf machinery does not
recognize all OpenMetrics types (`info` and `statesets` in this
context).

Fixes: kubernetes#2248

Signed-off-by: Pranshu Srivastava <[email protected]>
rexagod added a commit to rexagod/kube-state-metrics that referenced this issue Dec 13, 2023
Fallback to `gauge` metric type when the negotiated content-type is
`protofmt`-based, since Prometheus' protobuf machinery does not
recognize all OpenMetrics types (`info` and `statesets` in this
context).

Fixes: kubernetes#2248

Signed-off-by: Pranshu Srivastava <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants