MQE: subqueries #9664

Merged
merged 23 commits into from
Oct 21, 2024

@charleskorn charleskorn commented Oct 18, 2024

What this PR does

This PR adds support for subqueries to MQE.

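Our benchmarks show mild latency improvements over Prometheus' engine in most cases (geomean -4.87%). Peak memory utilisation is similar overall (geomean -2.34%): about half of the cases show no significant difference, and the largest changes are -46.41% for sum(sum_over_time(a_2000[10m:3m])) and +15.35% for sum_over_time(nh_2000[10m:3m]), both as range queries with 1000 steps: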

goos: darwin
goarch: arm64
pkg: github.com/grafana/mimir/pkg/streamingpromql/benchmarks
cpu: Apple M1 Pro
                                                                         │ Prometheus  │               Mimir                │
                                                                         │   sec/op    │   sec/op     vs base               │
Query/sum_over_time(a_1[10m:3m]),_instant_query-10                         156.0µ ± 4%   152.3µ ± 4%   -2.36% (p=0.015 n=6)
Query/sum_over_time(a_1[10m:3m]),_range_query_with_100_steps-10            165.6µ ± 2%   154.6µ ± 1%   -6.63% (p=0.002 n=6)
Query/sum_over_time(a_1[10m:3m]),_range_query_with_1000_steps-10           250.2µ ± 1%   245.1µ ± 2%   -2.03% (p=0.041 n=6)
Query/sum_over_time(a_100[10m:3m]),_instant_query-10                       1.116m ± 2%   1.075m ± 8%        ~ (p=0.065 n=6)
Query/sum_over_time(a_100[10m:3m]),_range_query_with_100_steps-10          1.882m ± 4%   1.691m ± 1%  -10.18% (p=0.002 n=6)
Query/sum_over_time(a_100[10m:3m]),_range_query_with_1000_steps-10         8.662m ± 0%   9.212m ± 1%   +6.36% (p=0.002 n=6)
Query/sum_over_time(a_2000[10m:3m]),_instant_query-10                      16.02m ± 1%   15.38m ± 1%   -3.99% (p=0.002 n=6)
Query/sum_over_time(a_2000[10m:3m]),_range_query_with_100_steps-10         30.11m ± 1%   26.48m ± 1%  -12.06% (p=0.002 n=6)
Query/sum_over_time(a_2000[10m:3m]),_range_query_with_1000_steps-10        160.5m ± 7%   171.7m ± 3%   +6.95% (p=0.015 n=6)
Query/sum_over_time(nh_1[10m:3m]),_instant_query-10                        202.3µ ± 1%   193.9µ ± 1%   -4.18% (p=0.002 n=6)
Query/sum_over_time(nh_1[10m:3m]),_range_query_with_100_steps-10           260.3µ ± 5%   246.1µ ± 2%   -5.45% (p=0.002 n=6)
Query/sum_over_time(nh_1[10m:3m]),_range_query_with_1000_steps-10          829.5µ ± 0%   800.1µ ± 2%   -3.54% (p=0.002 n=6)
Query/sum_over_time(nh_100[10m:3m]),_instant_query-10                      5.656m ± 3%   5.558m ± 3%   -1.73% (p=0.041 n=6)
Query/sum_over_time(nh_100[10m:3m]),_range_query_with_100_steps-10         10.78m ± 1%   10.32m ± 0%   -4.29% (p=0.002 n=6)
Query/sum_over_time(nh_100[10m:3m]),_range_query_with_1000_steps-10        62.86m ± 1%   58.10m ± 4%   -7.57% (p=0.002 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_instant_query-10                     105.3m ± 0%   103.4m ± 0%   -1.77% (p=0.002 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_range_query_with_100_steps-10        204.7m ± 1%   194.1m ± 5%   -5.21% (p=0.015 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_range_query_with_1000_steps-10        1.265 ± 3%    1.154 ± 1%   -8.81% (p=0.002 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_instant_query-10                    158.2µ ± 2%   150.4µ ± 2%   -4.93% (p=0.002 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_range_query_with_100_steps-10       169.8µ ± 2%   157.0µ ± 2%   -7.53% (p=0.002 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_range_query_with_1000_steps-10      279.2µ ± 1%   250.4µ ± 2%  -10.33% (p=0.002 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_instant_query-10                  1.135m ± 2%   1.071m ± 3%   -5.63% (p=0.002 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_range_query_with_100_steps-10     1.929m ± 1%   1.688m ± 3%  -12.50% (p=0.002 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_range_query_with_1000_steps-10    9.272m ± 1%   9.539m ± 1%   +2.88% (p=0.002 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_instant_query-10                 16.19m ± 4%   15.58m ± 3%   -3.75% (p=0.002 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_range_query_with_100_steps-10    31.18m ± 1%   27.01m ± 1%  -13.36% (p=0.002 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_range_query_with_1000_steps-10   183.2m ± 1%   178.0m ± 1%   -2.82% (p=0.002 n=6)
geomean                                                                    4.550m        4.329m        -4.87%

                                                                         │  Prometheus   │                Mimir                │
                                                                         │       B       │      B        vs base               │
Query/sum_over_time(a_1[10m:3m]),_instant_query-10                          73.68Mi ± 1%   73.67Mi ± 1%        ~ (p=0.784 n=6)
Query/sum_over_time(a_1[10m:3m]),_range_query_with_100_steps-10             73.23Mi ± 1%   73.53Mi ± 1%        ~ (p=0.132 n=6)
Query/sum_over_time(a_1[10m:3m]),_range_query_with_1000_steps-10            70.84Mi ± 1%   71.88Mi ± 1%   +1.48% (p=0.009 n=6)
Query/sum_over_time(a_100[10m:3m]),_instant_query-10                        67.51Mi ± 1%   66.95Mi ± 1%        ~ (p=0.132 n=6)
Query/sum_over_time(a_100[10m:3m]),_range_query_with_100_steps-10           67.80Mi ± 1%   67.02Mi ± 0%   -1.15% (p=0.002 n=6)
Query/sum_over_time(a_100[10m:3m]),_range_query_with_1000_steps-10          69.89Mi ± 1%   69.70Mi ± 1%        ~ (p=1.000 n=6)
Query/sum_over_time(a_2000[10m:3m]),_instant_query-10                       68.45Mi ± 2%   69.11Mi ± 1%        ~ (p=0.132 n=6)
Query/sum_over_time(a_2000[10m:3m]),_range_query_with_100_steps-10          75.21Mi ± 1%   77.66Mi ± 1%   +3.26% (p=0.002 n=6)
Query/sum_over_time(a_2000[10m:3m]),_range_query_with_1000_steps-10         133.3Mi ± 1%   127.6Mi ± 0%   -4.26% (p=0.002 n=6)
Query/sum_over_time(nh_1[10m:3m]),_instant_query-10                         78.29Mi ± 1%   78.73Mi ± 1%        ~ (p=0.394 n=6)
Query/sum_over_time(nh_1[10m:3m]),_range_query_with_100_steps-10            73.31Mi ± 1%   72.88Mi ± 0%        ~ (p=0.065 n=6)
Query/sum_over_time(nh_1[10m:3m]),_range_query_with_1000_steps-10           72.63Mi ± 1%   71.93Mi ± 1%   -0.97% (p=0.015 n=6)
Query/sum_over_time(nh_100[10m:3m]),_instant_query-10                       70.20Mi ± 1%   70.10Mi ± 1%        ~ (p=1.000 n=6)
Query/sum_over_time(nh_100[10m:3m]),_range_query_with_100_steps-10          73.68Mi ± 1%   73.95Mi ± 1%        ~ (p=0.180 n=6)
Query/sum_over_time(nh_100[10m:3m]),_range_query_with_1000_steps-10         120.0Mi ± 1%   121.8Mi ± 1%   +1.44% (p=0.004 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_instant_query-10                      76.34Mi ± 1%   73.81Mi ± 1%   -3.31% (p=0.002 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_range_query_with_100_steps-10         180.4Mi ± 1%   184.1Mi ± 1%   +2.09% (p=0.002 n=6)
Query/sum_over_time(nh_2000[10m:3m]),_range_query_with_1000_steps-10        630.1Mi ± 0%   726.8Mi ± 7%  +15.35% (p=0.002 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_instant_query-10                     73.16Mi ± 1%   73.48Mi ± 1%        ~ (p=0.485 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_range_query_with_100_steps-10        72.80Mi ± 2%   73.14Mi ± 1%        ~ (p=0.394 n=6)
Query/sum(sum_over_time(a_1[10m:3m])),_range_query_with_1000_steps-10       70.45Mi ± 0%   71.66Mi ± 1%   +1.72% (p=0.004 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_instant_query-10                   67.38Mi ± 2%   66.86Mi ± 1%        ~ (p=0.260 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_range_query_with_100_steps-10      67.45Mi ± 1%   66.66Mi ± 1%   -1.17% (p=0.026 n=6)
Query/sum(sum_over_time(a_100[10m:3m])),_range_query_with_1000_steps-10     69.45Mi ± 2%   66.91Mi ± 1%   -3.66% (p=0.002 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_instant_query-10                  69.12Mi ± 1%   68.40Mi ± 2%        ~ (p=0.065 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_range_query_with_100_steps-10     76.16Mi ± 1%   68.52Mi ± 2%  -10.03% (p=0.002 n=6)
Query/sum(sum_over_time(a_2000[10m:3m])),_range_query_with_1000_steps-10   132.88Mi ± 2%   71.20Mi ± 1%  -46.41% (p=0.002 n=6)
geomean                                                                     85.72Mi        83.71Mi        -2.34%

It is possible to improve the performance of the 1000-step cases. However, doing this without affecting the performance of range vector selectors requires a bit of refactoring that I'd prefer to do in a separate PR.
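
For anyone reading the `vs base` column in the tables above, here is a minimal sketch of how such a figure can be derived, assuming it is the relative change between the two summary values on a row. The helper names `percentChange` and `geomean` are hypothetical and are not part of Mimir or benchstat. Because the tables show rounded values, recomputing from them can differ from the printed percentage in the last digit.

```go
package main

import (
	"fmt"
	"math"
)

// percentChange returns the relative change from base to candidate, in percent.
// Negative values mean the candidate is lower (faster, or uses less memory).
func percentChange(base, candidate float64) float64 {
	return (candidate/base - 1) * 100
}

// geomean returns the geometric mean of strictly positive values.
func geomean(values []float64) float64 {
	sumLogs := 0.0
	for _, v := range values {
		sumLogs += math.Log(v)
	}
	return math.Exp(sumLogs / float64(len(values)))
}

func main() {
	// Rounded values copied from the first three latency rows, in microseconds.
	prometheus := []float64{156.0, 165.6, 250.2}
	mimir := []float64{152.3, 154.6, 245.1}

	for i := range prometheus {
		fmt.Printf("row %d: %+.2f%%\n", i+1, percentChange(prometheus[i], mimir[i]))
	}

	// Geomean over these three rows only, not over the whole table.
	fmt.Printf("geomean: %+.2f%%\n", percentChange(geomean(prometheus), geomean(mimir)))
}
```

A geometric mean is used for the summary row because it treats a 2x slowdown and a 2x speedup as cancelling out, regardless of the absolute size of each benchmark.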

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

@charleskorn charleskorn marked this pull request as ready for review October 18, 2024 03:32
@charleskorn charleskorn requested review from tacole02 and a team as code owners October 18, 2024 03:32

@tacole02 tacole02 left a comment

Looks good!


@jhesketh jhesketh left a comment

Overall looks really good to me, nice work!

I think there are just some minor improvements and extra tests we can do, but otherwise the approach and implementation turned out to be very neat with the structures we had in place :-).

There seems to be limited testing around native histograms (NH) in subqueries. We should add some more to ensure our buffers, reuse etc. are correct. In doing so, it would also be nice to see some more nested sub-queries (e.g. 3+ sub-queries deep). Additionally, different offsets or @ modifiers in the sub-queries would be good to check for correct step alignment.

Lastly, perhaps also a test, or a one-off benchmark, that checks the extreme case of having points at the end of the buffer outside the range (i.e. to check the previously discussed performance regression).

@charleskorn

> In doing so, it would also be nice to see some more nested sub-queries (e.g. 3+ sub-queries deep). Additionally, different offsets or @ modifiers in the sub-queries would be good to check for correct step alignment.

There are already a bunch of tests that cover offsets, @ and nesting (albeit not three levels deep) in the upstream test cases. Is there something in particular that you'd like to see covered that's not already covered there?

@charleskorn

> Lastly, perhaps also a test, or a one-off benchmark, that checks the extreme case of having points at the end of the buffer outside the range (i.e. to check the previously discussed performance regression).

Not sure I follow this sorry - this case is exercised for every range query with more than one step, and is covered by those benchmarks as well.

@jhesketh

> > In doing so, it would also be nice to see some more nested sub-queries (e.g. 3+ sub-queries deep). Additionally, different offsets or @ modifiers in the sub-queries would be good to check for correct step alignment.
>
> There are already a bunch of tests that cover offsets, @ and nesting (albeit not three levels deep) in the upstream test cases. Is there something in particular that you'd like to see covered that's not already covered there?

A case where the step continues to move from the parent query at each depth level (even if this doesn't really happen due to aligning with the epoch).

> > Lastly, perhaps also a test, or a one-off benchmark, that checks the extreme case of having points at the end of the buffer outside the range (i.e. to check the previously discussed performance regression).
>
> Not sure I follow this sorry - this case is exercised for every range query with more than one step, and is covered by those benchmarks as well.

I was thinking as a one-off benchmark that does it at an extreme, but you're probably right that this isn't necessary.
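
To make the case under discussion concrete (points sitting at the end of the buffer that fall after the current step's range), here is a minimal sketch of a timestamp-ordered ring buffer. It is illustrative only and is not the code in `pkg/streamingpromql/types/fpoint_ring_buffer.go`: the `point` and `ringBuffer` types, their method names, the linear scan and the left-open window boundary are all assumptions.

```go
package main

import "fmt"

// point is a hypothetical float sample: timestamp in milliseconds plus value.
type point struct {
	T int64
	F float64
}

// ringBuffer is an illustrative ring buffer of points ordered by timestamp.
type ringBuffer struct {
	points []point // backing storage; len(points) is the capacity
	first  int     // index of the oldest point
	size   int     // number of points currently held
}

// Append adds p, doubling the backing slice when it is full.
func (b *ringBuffer) Append(p point) {
	if b.size == len(b.points) {
		newCap := 2 * len(b.points)
		if newCap == 0 {
			newCap = 4
		}
		grown := make([]point, newCap)
		for i := 0; i < b.size; i++ {
			grown[i] = b.points[(b.first+i)%len(b.points)]
		}
		b.points, b.first = grown, 0
	}
	b.points[(b.first+b.size)%len(b.points)] = p
	b.size++
}

// DiscardPointsAtOrBefore drops points with T <= t.
// Assumption: the range window is left-open, so t is its excluded start.
func (b *ringBuffer) DiscardPointsAtOrBefore(t int64) {
	for b.size > 0 && b.points[b.first].T <= t {
		b.first = (b.first + 1) % len(b.points)
		b.size--
	}
}

// CountAtOrBefore returns how many buffered points have T <= maxT.
// Points after maxT stay buffered for later steps, so every step has to
// locate where its own window ends.
func (b *ringBuffer) CountAtOrBefore(maxT int64) int {
	n := 0
	for n < b.size && b.points[(b.first+n)%len(b.points)].T <= maxT {
		n++
	}
	return n
}

func main() {
	var buf ringBuffer
	for t := int64(0); t <= 50_000; t += 10_000 {
		buf.Append(point{T: t, F: float64(t) / 1000})
	}

	// Step 1 covers (0s, 30s]: the points at 40s and 50s are already
	// buffered but lie beyond this step's range.
	buf.DiscardPointsAtOrBefore(0)
	inRange := buf.CountAtOrBefore(30_000)
	fmt.Println("in range:", inRange, "beyond range:", buf.size-inRange)

	// Step 2 covers (20s, 50s]: the points that were beyond the range are now used.
	buf.DiscardPointsAtOrBefore(20_000)
	fmt.Println("in range:", buf.CountAtOrBefore(50_000))
}
```

In a subquery the inner samples are typically produced once and then shared across the outer steps, so the buffer can hold points past the end of the current window whenever there is more than one step. The scan in `CountAtOrBefore` is where an extreme version of this case would show up in a benchmark.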

@charleskorn

> A case where the step continues to move from the parent query at each depth level (even if this doesn't really happen due to aligning with the epoch).

Done, I've added a test like this in 220e5de.

eval range from 0 to 4m step 20s sum_over_time(sum_over_time(metric[2m:30s])[3m:15s])
{} 0 0 0 1 2 4 10 14 20 35 43 54 78

eval range from 0 to 4m step 3m sum_over_time(sum_over_time(sum_over_time(metric[2m:30s])[3m:15s])[4m:20s])

Turtles all the way down
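
The "aligning with epoch" point made earlier in this thread can be sketched as follows. This is a hedged illustration of how the inner evaluation timestamps of a subquery can be derived, not MQE's or Prometheus' code: `subquerySteps` is a hypothetical helper, and the left-open window `(evalT-range, evalT]` is an assumption, since some engine versions include the left boundary.

```go
package main

import "fmt"

// subquerySteps returns the timestamps (in milliseconds since the epoch) at
// which the inner expression of a subquery `expr[rangeMs:stepMs]` is evaluated
// when the subquery itself is evaluated at evalT.
//
// Timestamps are multiples of stepMs, independent of evalT.
// Assumption: the window is left-open, (evalT-rangeMs, evalT]; an engine that
// includes the left boundary would also return evalT-rangeMs when it is aligned.
func subquerySteps(evalT, rangeMs, stepMs int64) []int64 {
	windowStart := evalT - rangeMs

	// Find the first multiple of stepMs strictly after windowStart.
	// Go's integer division truncates toward zero, so the check below also
	// covers negative window starts.
	first := (windowStart / stepMs) * stepMs
	if first <= windowStart {
		first += stepMs
	}

	var steps []int64
	for t := first; t <= evalT; t += stepMs {
		steps = append(steps, t)
	}
	return steps
}

func main() {
	const second, minute = int64(1000), int64(60_000)

	// metric[2m:30s] evaluated at two nearby parent timestamps.
	fmt.Println(subquerySteps(100*second, 2*minute, 30*second))
	fmt.Println(subquerySteps(110*second, 2*minute, 30*second))

	// The middle level of the nested test: [...][3m:15s] evaluated at 4m.
	fmt.Println(len(subquerySteps(4*minute, 3*minute, 15*second)))
}
```

Under these assumptions the first two calls return the same four timestamps (0s, 30s, 60s and 90s), even though the parent evaluation time moved by 10s. That is why the inner steps do not drift with the parent query, whatever the nesting depth.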


@jhesketh jhesketh left a comment

🚀

@charleskorn charleskorn enabled auto-merge (squash) October 21, 2024 02:09
@charleskorn charleskorn merged commit 3031f23 into main Oct 21, 2024
31 checks passed
@charleskorn charleskorn deleted the charleskorn/mqe-subqueries branch October 21, 2024 02:19
charleskorn added a commit that referenced this pull request Oct 21, 2024
* Enable upstream tests

* Add benchmark

* Introduce feature toggle

* Don't assume all operators are running at the top level of a query

* Add ability to reuse an existing point slice for a ring buffer

* Add `Release` method to ring buffers

* Introduce range query tests

* Bring in TestSubquerySelector from Prometheus

* Change `RangeVectorOperator.NextStepSamples` to return ring buffers rather than receive them

* Refactor `TestSubqueries`

* Initial (largely working) implementation

* Fix handling of @

* Enable newly supported upstream test cases

* Add further benchmark

* Add changelog entry

* Address PR feedback: clarify comments

Co-authored-by: Joshua Hesketh <[email protected]>

* Add tests for ring buffer `Release` implementations

* Address PR feedback: update comment to match new behaviour

* Address PR feedback: fix indentation

* Expand native histogram tests

* Add test for deeply nested subqueries with changing step.

* Run test cases in TestSubqueries against Prometheus' engine too.

---------

Co-authored-by: Joshua Hesketh <[email protected]>
(cherry picked from commit 3031f23)

Co-authored-by: Charles Korn <[email protected]>
@jhesketh jhesketh mentioned this pull request Dec 2, 2024