Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query priority #5605

Merged
merged 49 commits into from
Nov 30, 2023
Merged

Conversation

justinjung04
Copy link
Contributor

@justinjung04 justinjung04 commented Oct 17, 2023

What this PR does:

Currently, each tenant has single FIFO request queue in the query path, where queriers assigned for the tenant will pick up next request from the queue.

Having a single FIFO queue results in situations where

  1. Slow queries exhaust all queriers and their concurrency
  2. Fast (or more important) queries get queued up, waiting for next querier to pick them up
  3. If the slow queries took 30 seconds, then the fast (or more important) queries now have 30 seconds added to their latency
  4. If the slow queries timed out, then the fast (or more important) queries are also likely to timeout in the queue (without even reaching querier)

image

image

This PR suggests to introduce new "query priority" for each tenant that does the following:

  1. Able to assign priorities to queriers that meet the query attributes (regex, start time, end time)
  2. When query priority is enabled, schedulers use priority queue that orders the requests by priority
  3. You can also have "reserved" queriers that only pick requests with a certain priority. This is useful when you want to guarantee certain number of queriers available to handle high priority requests at any given time.

image

image

Here's an example configuration:

  <tenant-id>:
    query_priority:
      enabled: true
      default_priority: 0
      priorities:
        - priority: 3
          reserved_queriers: 1
          query_attributes:
            - regex: "special_query"
        - priority: 2
          query_attributes:
            - regex: "max"
              time_window:
                start: 2h
                end: 0h
        - priority: 1
          query_attributes:
            - time_window:
                start: 24h

Which issue(s) this PR fixes:
n/a

End-to-end test result:

There were 4 things to verify from the end-to-end test:

  • Confirm high priority requests get dequeued more than lower priority requests
  • Confirm reserved queriers do not pick up low priority requests
  • Confirm the tenant limit config change gets picked up by query frontend and scheduler without the pods restarting
  • Check memory and cpu usage

In short, all of them were verified successfully with some cool remarks:

  • P50 and avg latency droped significantly, as my test was assigning higher priority to ingester queries (data within 24h)
  • Number of timeouts decreased, at the cost of more 429s and higher queue length (queue length increased with low queries not handled in time, which can be addressed by scaling up queriers in real world scenario)
  • Memory usage of QFE is better when the query priority is enabled
  • Memory usage of scheduler is slightly worse when the query priority is enabled, but negligible

Image 1: Metrics with new labels showing smooth transition between FIFO and priority queue.
Screenshot 2023-11-15 at 4 28 53 PM
Screenshot 2023-11-15 at 4 29 08 PM

Image 2: Change in QPS per status code and latency.
Screenshot 2023-11-15 at 4 29 36 PM

Image 3: Change in CPU and memory usage.
Screenshot 2023-11-15 at 4 30 46 PM
Screenshot 2023-11-15 at 4 31 02 PM

Image 4: One reserved querier waiting for priority 3 query that was never enqueued.
Screenshot 2023-11-15 at 4 31 39 PM

Test setup:

Queries ran continuously:

"query?query=max({__name__=~'metric_[0-4]'})&time=0h
"query?query=avg({__name__=~'metric_[1-5]'})&time=-12h
"query?query=min({__name__=~'metric_[2-6]'})&time=-32h

Query priority config used:

    query_priority:
      enabled: true
      default_priority: 0
      priorities:
        - priority: 3
          reserved_queriers: 1
          query_attributes:
            - regex: "special_query"
        - priority: 2
          query_attributes:
            - regex: "max"
              time_window:
                start: 2h
                end: 0h
        - priority: 1
          query_attributes:
            - time_window:
                start: 24h

Timeline:

14:00 | Applied new image that contains this PR
14:50 | Enabled query priority config
15:45 | Disabled query priority config without removing the priority definitions

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@justinjung04 justinjung04 force-pushed the scheduler-improvement branch 2 times, most recently from 9a9275c to 5a57717 Compare October 17, 2023 21:09
@justinjung04 justinjung04 marked this pull request as ready for review October 19, 2023 04:26
@justinjung04 justinjung04 force-pushed the scheduler-improvement branch from fc1260d to c75a0ca Compare October 19, 2023 17:06
@justinjung04
Copy link
Contributor Author

Since this is a big change, I'm running test in my dev env as well. Once I have the result I'll share it here.

@yeya24
Copy link
Contributor

yeya24 commented Oct 22, 2023

Can you please add an E2E test case for this feature? It would be good to show that the reserved querier only processes high priority query

pkg/scheduler/schedulerpb/scheduler.proto Outdated Show resolved Hide resolved
pkg/util/query/priority_test.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/queue.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/queue.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/user_queues.go Outdated Show resolved Hide resolved
pkg/frontend/v2/frontend.go Outdated Show resolved Hide resolved
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
pkg/frontend/v2/frontend.go Outdated Show resolved Hide resolved
@alanprot
Copy link
Member

I know we are using a chan now as a queue.. but why not use a real priority queue instead of 2 queue so we can have any number of priorities (instead just HIGH and LOW)

I would also have try to have some kind of heuristic to set this priority? setting this manually seems not scalable.

@justinjung04 justinjung04 force-pushed the scheduler-improvement branch from fc9351a to 20692fa Compare October 30, 2023 21:17
@justinjung04 justinjung04 marked this pull request as draft November 1, 2023 18:08
@justinjung04 justinjung04 force-pushed the scheduler-improvement branch from 280dbab to 6044ec7 Compare November 1, 2023 18:58
@justinjung04 justinjung04 changed the title High priority query queue Priority query Nov 7, 2023
@justinjung04 justinjung04 changed the title Priority query Query priority Nov 8, 2023
@justinjung04 justinjung04 force-pushed the scheduler-improvement branch from 09610c2 to 17bdef6 Compare November 9, 2023 00:47
@justinjung04 justinjung04 marked this pull request as ready for review November 10, 2023 20:50
@justinjung04
Copy link
Contributor Author

Major upgrade completed, @yeya24 and @alanprot could you take a look?

Meanwhile I'll run beta test to check the followings:

  • Confirm high priority requests get dequeued more than lower priority requests
  • Confirm reserved queriers do not pick up low priority requests
  • Confirm the tenant limit config change gets picked up by query frontend and scheduler without the pods restarting
  • Check if memory usage changes significantly when using priority queue

@justinjung04 justinjung04 force-pushed the scheduler-improvement branch 2 times, most recently from 5d0746f to ededdc2 Compare November 15, 2023 12:59
@justinjung04
Copy link
Contributor Author

justinjung04 commented Nov 15, 2023

E2E testing completed and the result looks awesome. Added to the PR description.

queue queue
lengthGauge prometheus.Gauge
}

// Op is an operation on the priority queue.
type Op interface {
Key() string
Copy link
Contributor Author

@justinjung04 justinjung04 Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generalizing PriorityQueue by removing the Key and duplicate key check. Confirmed that this code is not used elsewhere in the codebase.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Overall I think the pr is in the right direction and in good shape. I have added some comments.

pkg/util/query/priority.go Outdated Show resolved Hide resolved
pkg/util/query/priority.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/user_request_queue.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/user_request_queue.go Outdated Show resolved Hide resolved
pkg/frontend/v2/frontend.go Outdated Show resolved Hide resolved
pkg/frontend/v2/frontend_test.go Outdated Show resolved Hide resolved
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
pkg/util/query/priority.go Outdated Show resolved Hide resolved
pkg/frontend/transport/roundtripper.go Outdated Show resolved Hide resolved
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
…auge vector in user request queue

Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
pkg/frontend/transport/handler.go Outdated Show resolved Hide resolved
pkg/querier/tripperware/instantquery/instant_query.go Outdated Show resolved Hide resolved
pkg/frontend/transport/handler.go Show resolved Hide resolved
Signed-off-by: Justin Jung <[email protected]>
@damnever
Copy link
Contributor

We should also consider this scenario: a large number of high-priority requests could potentially cause lower-priority ones to time out.

@justinjung04
Copy link
Contributor Author

justinjung04 commented Nov 29, 2023

We should also consider this scenario: a large number of high-priority requests could potentially cause lower-priority ones to time out.

Yes, I wanted to make this PR as a preliminary feature that assumes small list of priorities and querier scale up if too many queries are timing out.

I agree that this feature should be improved by introducing fairness logic between different priorities, but wanted to keep the initial PR simple as the code change involved with introducing priority itself seemed big enough.

Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Awesome work

@justinjung04
Copy link
Contributor Author

Performed another end-to-end test, and things are still working as expected.

@yeya24 yeya24 merged commit e85a331 into cortexproject:master Nov 30, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants