Add prometheus metrics #431

Closed
wants to merge 4 commits into from

Conversation

Contributor

@JessicaGreben commented Oct 12, 2021

This PR is for issue #378 and adds the initial scaffolding to export metrics for Prometheus. Ultimately we will fully migrate to Prometheus instead of using the current Statsd metrics, but in the meantime we will need to support both.

The baseplate spec specifies that an active_requests metric must be emitted (ref: spec), so this PR adds that as the first custom metric.

The /metrics endpoint has been added to the Go baseplate code via the baseplate-cookiecutter repo in this PR.

The next piece of work, to be done in a different PR, is to add the scaffolding that allows applications to generate custom metrics for their own purposes.
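
For reference, exposing a /metrics endpoint with client_golang generally looks like the sketch below; the actual wiring lives in the baseplate-cookiecutter template and may differ (the port and the standalone main here are illustrative only):

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Serve the default Prometheus registry so it can be scraped.
	http.Handle("/metrics", promhttp.Handler())
	// Port chosen for illustration; the real service config decides this.
	log.Fatal(http.ListenAndServe(":6060", nil))
}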

@JessicaGreben requested a review from a team as a code owner October 12, 2021 22:41
@JessicaGreben requested review from fishy and kylelemons and removed request for a team October 12, 2021 22:41
Contributor

@kylelemons left a comment


I thought the plan was to use promauto? This feels like a lot of machinery to me
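
For readers following along, the difference promauto makes is roughly the sketch below (metric names here are illustrative, not part of this PR): promauto registers the metric with the default registry at construction time, so no separate registration plumbing is needed.

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// With promauto, construction and registration happen in one step
// (the metric is registered with prometheus.DefaultRegisterer).
var exampleGauge = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "example_in_flight",
	Help: "Illustrative gauge registered via promauto.",
})

// The equivalent without promauto needs an explicit registration step.
var exampleGaugeManual = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "example_in_flight_manual",
	Help: "Illustrative gauge registered by hand.",
})

func init() {
	prometheus.MustRegister(exampleGaugeManual)
}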

metricsbp/prometheus.go (outdated; resolved)
Comment on lines +10 to +13
var activeRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "active_requests",
	Help: "The number of requests being handled by the service.",
})
Member


In statsd this is a "runtime gauge" instead of a plain gauge. A "runtime gauge" comes with two tags, the hostname and the pid, and automatically adds a "runtime." prefix.

Then we have some special handling in telegraf for everything under the "runtime." prefix: we strip the two special tags and calculate the average/max/min/etc. across all the pods reporting this gauge. This is kind of like treating it as a histogram instead of a gauge.

We should probably have an internal spec discussion first on how we want to handle runtime gauges in Prometheus.
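
Purely to illustrate what a literal translation of that statsd behavior into client_golang terms might look like (a hypothetical sketch, not a recommendation): the hostname and pid become constant labels and the "runtime." prefix becomes a namespace.

import (
	"os"
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Hypothetical literal translation of a statsd "runtime gauge":
// exported as runtime_active_requests with per-process constant labels.
var runtimeActiveRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Namespace: "runtime",
	Name:      "active_requests",
	Help:      "Requests in flight, tagged per process like the statsd runtime gauge.",
	ConstLabels: prometheus.Labels{
		"hostname": hostnameOrEmpty(), // hypothetical helper, see below
		"pid":      strconv.Itoa(os.Getpid()),
	},
})

func hostnameOrEmpty() string {
	h, _ := os.Hostname()
	return h
}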


@bjk-reddit Oct 14, 2021


This doesn't really exist the same way in Prometheus, because Prometheus always includes the instance of the monitored target in every metric. For Go, there is only ever one process, so effectively every metric already includes those tags.

Contributor Author


That part of the spec needs to be corrected for Prometheus.

Contributor

@kylelemons Oct 15, 2021


Yeah, the PID part is only for einhorn-esque things

@@ -63,7 +63,8 @@ type Config struct {
 // with the global tracing hook registry.
 func InitFromConfig(ctx context.Context, cfg Config) io.Closer {
 	M = NewStatsd(ctx, cfg)
-	tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M})
+	pm := NewPrometheusMetrics(ctx, cfg)
+	tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M, PrometheusMetrics: pm})
Member


As a style nit, I feel that the only appropriate case for writing a whole struct literal on one line is when you fill zero or one of its fields. Anything with more than one field should be written one-per-line instead:

Suggested change
tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{Metrics: M, PrometheusMetrics: pm})
tracing.RegisterCreateServerSpanHooks(CreateServerSpanHook{
	Metrics:           M,
	PrometheusMetrics: pm,
})

)

var activeRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "active_requests",


Is this specific to a protocol? Typically for this metric we would prefix it with the protocol.

This can be done with the Namespace. Something like this:

prometheus.GaugeOpts{
	Namespace: "http",
	Name:      "active_requests",
	...
}
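
Filled out, that suggestion would look roughly like the sketch below and export a metric named http_active_requests, since Namespace and Name are joined with an underscore (the Help text here is illustrative):

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Exported as "http_active_requests" because of the "http" namespace.
var httpActiveRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Namespace: "http",
	Name:      "active_requests",
	Help:      "The number of HTTP requests currently being handled by the service.",
})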

Contributor Author

@JessicaGreben Oct 14, 2021


I was thinking of adding a label with the transport type, as stated here, just until the baseplate.go remodel next year when tracing and metrics are decoupled. For simplicity I was going to leave this as is, but I can look into what's involved in moving over to the protocol-specific metrics that I'm adding shortly, if that's better.

Member


If we are doing per-transport active_requests then the server span hook is no longer the correct place to do it, because the server span hook runs on all servers and has no knowledge of which transport it is in. This needs to be done in the server middleware instead.
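
As a rough illustration of that point (a plain net/http sketch, not baseplate.go's actual middleware API), the increment/decrement would live in the transport-specific middleware, which knows which protocol it serves:

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var httpActiveRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Namespace: "http",
	Name:      "active_requests",
	Help:      "In-flight HTTP requests.",
})

// activeRequestsMiddleware tracks in-flight requests for the HTTP transport only;
// a thrift (or other transport) middleware would update its own gauge.
func activeRequestsMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		httpActiveRequests.Inc()
		defer httpActiveRequests.Dec()
		next.ServeHTTP(w, r)
	})
}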

Contributor Author


Sounds like the best thing to do is move this to the per-transport server middleware. Will make that change.

@@ -35,21 +38,23 @@ func (h CreateServerSpanHook) OnCreateServerSpan(span *tracing.Span) error {
 // ends, with success=True/False tags for the counter based on whether an error
 // was passed to `span.End` or not.
 type spanHook struct {
-	metrics *Statsd
+	metrics           *Statsd
+	prometheusMetrics *PrometheusMetrics
Contributor


From what I've seen, the "best practice" for injecting Prometheus metrics (especially when there's just one, like "active_requests") is to pass in the prometheus.Counter -- that could also allow you to do "active_requests" per-transport here, using prometheus.CounterVec.With to pre-populate the label.

Leaving this out and just doing it in the per-transport section also sounds fine, if you prefer.
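
A sketch of what that injection could look like (the comment mentions CounterVec, but the same pattern applies to a GaugeVec for active_requests; the names and hook shape here are hypothetical, not the struct from this PR):

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var activeRequestsVec = promauto.NewGaugeVec(prometheus.GaugeOpts{
	Name: "active_requests",
	Help: "In-flight requests, labeled by transport.",
}, []string{"transport"})

// spanHook is handed a gauge whose transport label is already filled in,
// so the hook itself stays transport-agnostic.
type spanHook struct {
	activeRequests prometheus.Gauge
}

func newHTTPSpanHook() spanHook {
	return spanHook{
		// With pre-populates the label; the hook then just calls Inc/Dec.
		activeRequests: activeRequestsVec.With(prometheus.Labels{"transport": "http"}),
	}
}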

Contributor Author


Thanks for the details. Sounds like I will close this PR and add active_requests to the per-transport PRs.

@JessicaGreben
Contributor Author

Closing this to add active_requests to the per-transport server middleware instead.

@JessicaGreben deleted the sre-1153-add-prometheus branch October 15, 2021 17:32