Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ruler to retrieve proto format query response #6345

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

SungJin1212
Copy link
Contributor

@SungJin1212 SungJin1212 commented Nov 18, 2024

In #6151, we can retrieve query results from Query frontends when the rules are evaluated. The limitation was, since we only support json format, we cannot retrieve query results including the native histogram.

To address this concern, this PR adds -ruler.query-response-format flag, which specifies which format to use when retrieving query results from the Query frontend. The supported values are protobuf and json, so we can retrieve proto format query results when we set the value to protobuf.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@dosubot dosubot bot added the component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. label Nov 18, 2024
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 4 times, most recently from 872120e to 5f93620 Compare November 18, 2024 06:56
@SungJin1212 SungJin1212 changed the title Allow ruler to retrieve proto format query response Support ruler to retrieve proto format query response Nov 19, 2024
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 2 times, most recently from 6915258 to fb61f5a Compare November 19, 2024 08:16
@SungJin1212 SungJin1212 requested a review from yeya24 November 19, 2024 08:36
@@ -176,7 +183,7 @@ func (instantQueryCodec) EncodeResponse(ctx context.Context, res tripperware.Res
return nil, httpgrpc.Errorf(http.StatusInternalServerError, "invalid response format")
}

b, err := json.Marshal(a)
contentType, b, err := marshalResponse(a, req.Header.Get("Accept"))
Copy link
Contributor

@yeya24 yeya24 Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only try to return non JSON response format if we know the request comes from Ruler?
If the request doesn't come from Ruler I feel we want to enforce JSON response even though they specify protobuf in their protocol like Accept: protobuf, json. To reduce impact as this protobuf format is Cortex specific.

But tbh Idk if we can identify ruler requests or not. We probably have to do what you are doing here.

WDYT? @alanprot @danielblando

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yeya24
One way is to define a new MIME type like application/x-protobuf-ruler (just example) and then use it to accept header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.. maybe we should define our internal MIME type so external calls will only get that format if they really wanna.
I would not add "ruler" in the format though. Maybe something like application/x-cortex.query.v1+proto (idk the latest best practices to define those custom Mime types :P )

Copy link
Contributor

@yeya24 yeya24 Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds worth trying. Let's hear from other maintainers, too

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from fb61f5a to 80d5e87 Compare November 25, 2024 06:20
@SungJin1212
Copy link
Contributor Author

@yeya24 @alanprot
I defined the application/x-cortex-query+proto MIME type and used it to communicate between the Ruler and Query Frontend.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from 80d5e87 to ec03c22 Compare November 25, 2024 11:11
Comment on lines 1739 to 1801
{
name: "proto type",
acceptHeader: "application/x-protobuf",
expectedContentType: "application/json",
},
Copy link
Contributor Author

@SungJin1212 SungJin1212 Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can enforce json marshal if we get the other types except the application/x-cortex-query+proto.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from ec03c22 to 6c4fe46 Compare November 26, 2024 00:09
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from 6c4fe46 to 0b4bf23 Compare December 4, 2024 07:15
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 2 times, most recently from 86a0a52 to 27d40bd Compare December 4, 2024 07:25
@SungJin1212
Copy link
Contributor Author

SungJin1212 commented Dec 4, 2024

@yeya24
Could you take a look when you have time :D?

pkg/querier/codec/protobuf_codec_test.go Outdated Show resolved Hide resolved
pkg/querier/tripperware/instantquery/instant_query.go Outdated Show resolved Hide resolved
@@ -154,7 +161,7 @@ func (c instantQueryCodec) EncodeRequest(ctx context.Context, r tripperware.Requ
}
}

tripperware.SetRequestHeaders(h, c.defaultCodecType, c.compression)
tripperware.SetRequestHeaders(h, c.defaultCodecType, c.compression, c.rulerQueryResponseFormat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always ask for cortex internal response format in instant query even though the requests are not initiated from Ruler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh.. I see.
We can utilize the User-Agent header value at DecodeRequest and EncodeRequest.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 4 times, most recently from 194aaa3 to 5a369e5 Compare December 6, 2024 01:29
@SungJin1212 SungJin1212 requested a review from yeya24 December 6, 2024 01:35
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from 5a369e5 to 26ee074 Compare December 6, 2024 02:11
if strings.EqualFold(h, header) {
result.Headers[h] = hv
break
if isSourceRuler(&r.Header) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can initialize isSourceRuler(&r.Header) as a var.

return tripperware.ApplicationJson, b, err
}

func isSourceRuler(h *http.Header) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that we identified query traffic and ruler traffic now.
Can we also add this to frontend/transport/handler.go? We can have additional logging about the query source and update metrics (like queries from rule or API, etc).

Copy link
Contributor Author

@SungJin1212 SungJin1212 Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to add a label (source) to query stats metrics?

h.queryScannedSamples = promauto.With(reg).NewCounterVec(prometheus.CounterOpts{
			Name: "cortex_query_samples_scanned_total",
			Help: "Number of samples scanned to execute a query.",
		}, []string{"user", "source"})

And add a log field to ServeHTTP. right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to add a label (source) to query stats metrics?

We don't need to add this for all metrics. Only the counter for number of requests should be good enough

Copy link
Contributor Author

@SungJin1212 SungJin1212 Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a counter metric in the handler.
Should I add it?
FYI. We have a cortex_ruler_query_frontend_request_duration_seconds histogram.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add a User-Agent field to logger.
logger := util_log.WithUserAgent(userAgent, util_log.WithContext(r.Context(), f.log))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, we actually already have one called queriesPerTenant but it is in roundtrip.go. Sorry for the wrong info.

Copy link
Contributor Author

@SungJin1212 SungJin1212 Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!
I add a source label to cortex_query_frontend_queries_total.
The label value is ruler or others.
WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 3 times, most recently from 7c686bf to d988c9e Compare December 10, 2024 07:52
@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from d988c9e to d3f99c0 Compare December 11, 2024 01:30
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in good shape now... Some nites only.
Would be good to have more people to review this change.

pkg/ruler/frontend_decoder.go Outdated Show resolved Hide resolved
@@ -18,8 +20,18 @@ const (
)

type JsonDecoder struct{}
type ProtobufDecoder struct{}
type Warning []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warnings?

Copy link
Contributor Author

@SungJin1212 SungJin1212 Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean is we should return nil instead of the resp.Warnings at the ProtobufDecoder decode func since the error is only decoded by the json codec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... Just the naming thing. Better to call it Warnings instead of Warning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed it.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from d3f99c0 to 437a332 Compare December 12, 2024 01:17
@yeya24
Copy link
Contributor

yeya24 commented Dec 12, 2024

@rajagopalanand @rapphil Maybe you can help review this PR as well. Thanks

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch 4 times, most recently from 31f2cce to af3e05b Compare December 18, 2024 09:18
@SungJin1212
Copy link
Contributor Author

@alanprot
Could you take a look?

@@ -147,6 +149,7 @@ func TestProtobufCodec_Encode(t *testing.T) {
},
},
{
name: "matrix",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.. let's add a name and delete the duplicate one. Thanks!

@@ -29,6 +31,9 @@ var (
SortMapKeys: true,
ValidateJsonRawMessage: false,
}.Froze()

rulerMIMEType = v1.MIMEType{Type: "application", SubType: tripperware.QueryResponseCortexMIMESubType}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do rulerMIMEType = tripperware.QueryResponseCortexMIMEType here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tripperware.QueryResponseCortexMIMEType is a string, so it is hard to use it to MIMEType directly.

{
name: "empty accept header",
acceptHeader: "",
expectedContentType: "application/json",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we do expectedContentType: tripperware.ApplicationJson instead?

@@ -763,7 +769,8 @@ func SetRequestHeaders(h http.Header, defaultCodecType CodecType, compression Co
}

func UnmarshalResponse(r *http.Response, buf []byte, resp *PrometheusResponse) error {
if r.Header != nil && r.Header.Get("Content-Type") == ApplicationProtobuf {
contentType := r.Header.Get("Content-Type")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check that r.Header != nil before getting the contentType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching!

@@ -230,7 +230,7 @@ func (c prometheusCodec) DecodeResponse(ctx context.Context, r *http.Response, _
return &resp, nil
}

func (prometheusCodec) EncodeResponse(ctx context.Context, res tripperware.Response) (*http.Response, error) {
func (prometheusCodec) EncodeResponse(ctx context.Context, _ *http.Request, res tripperware.Response) (*http.Response, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems weird that we need to pass in a request here if it's not being used.

Copy link
Contributor Author

@SungJin1212 SungJin1212 Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just for the interface.
It is not used in the range query, since the ruler only does instant query.

@@ -156,7 +156,8 @@ func NewQueryTripperware(
now := time.Now()
userStr := tenant.JoinTenantIDs(tenantIDs)
activeUsers.UpdateUserTimestamp(userStr, now)
queriesPerTenant.WithLabelValues(op, userStr).Inc()
source := getSource(r.Header.Get("User-Agent"))
queriesPerTenant.WithLabelValues(op, userStr, source).Inc()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably shouldn't have the user-agent as a possible label value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it can be set by the user, then there's a possibility of cardinality explosion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out. I think source is good enough and we don't need the user agent here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label values would be ruler or api. Don't we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm ignore my comment. We are not using user agent value directly as label value.

@SungJin1212 SungJin1212 force-pushed the Add-protobuf-codec-to-ruler branch from af3e05b to da4a4da Compare December 20, 2024 02:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. size/XL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants