Don't propagate cancel signal to the Prometheus rules manager context #6326
@@ -341,11 +341,15 @@ func DefaultTenantManagerFactory(cfg Config, p Pusher, q storage.Queryable, engi
queryFunc = metricsQueryFunc
}

// We let the Prometheus rules manager control the context so that there is a chance
// for graceful shutdown of rules that are still in execution even in case the cortex context is canceled.
prometheusContext := user.InjectOrgID(context.WithoutCancel(ctx), userID)
til! :D
TIL
this was added in go1.21.0
LGTM!
This change allows the rules that are still executing queries to complete before Cortex is fully shut down. Signed-off-by: Raphael Silva <[email protected]>
65bc9e8 to 6f53f67
Signed-off-by: Raphael Silva <[email protected]>
Use atomic counter to keep track of the successful queries Signed-off-by: Raphael Silva <[email protected]>
69a93db to bd038d5
…cortexproject#6326)
* Don't propagate cancel signal to the Prometheus rules manager context
  This change allows the rules that are still executing queries to complete before Cortex is fully shut down.
  Signed-off-by: Raphael Silva <[email protected]>
* Make ruler unit tests to run faster
  Signed-off-by: Raphael Silva <[email protected]>
* Avoid tests to fail due to race condition
  Use atomic counter to keep track of the successful queries
  Signed-off-by: Raphael Silva <[email protected]>
---------
Signed-off-by: Raphael Silva <[email protected]>
What this PR does:
This PR lets rules that are still executing queries complete before Cortex shuts down. It does so by removing the cancel signal from the context passed to the Prometheus rules manager. Instead, the ruler relies on the shutdown mechanism of the rules manager and on the timeout of the queries executed by the rules.
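As a rough illustration of the mechanism (not the Cortex code itself), the sketch below shows what `context.WithoutCancel` does in isolation: the derived context keeps the parent's values but no longer observes its cancellation. The `tenantKey` type and the `detachedTenantContext` and `demo` helpers are hypothetical names invented for this example; the actual change uses `user.InjectOrgID` to attach the tenant ID.

```go
package main

import (
	"context"
	"fmt"
)

// tenantKey is a hypothetical context key used only for this illustration.
type tenantKey struct{}

// detachedTenantContext keeps the values carried by ctx but drops its
// cancellation signal, then attaches a tenant ID (an assumption standing in
// for user.InjectOrgID).
func detachedTenantContext(ctx context.Context, tenant string) context.Context {
	return context.WithValue(context.WithoutCancel(ctx), tenantKey{}, tenant)
}

// demo cancels the parent context and reports the state of both contexts.
func demo() (parentErr, detachedErr error, tenant string) {
	parent, cancel := context.WithCancel(context.Background())
	detached := detachedTenantContext(parent, "tenant-1")

	cancel() // simulate the service context being canceled at shutdown

	tenant, _ = detached.Value(tenantKey{}).(string)
	return parent.Err(), detached.Err(), tenant
}

func main() {
	parentErr, detachedErr, tenant := demo()
	fmt.Println("parent:", parentErr)     // parent: context canceled
	fmt.Println("detached:", detachedErr) // detached: <nil>
	fmt.Println("tenant:", tenant)        // tenant: tenant-1
}
```

Because the detached context is never canceled by its parent, something else has to bound the work. In this PR that role is played by the rules manager's own shutdown and the query timeout.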
In the unit tests I tried to avoid adding sleeps and instead used channels to coordinate the execution of the queries to simulate the problem.
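The coordination idea can be sketched roughly as follows. This is not the PR's test code: `runQueries`, `started`, and `release` are hypothetical names, and the fake "query" only checks its context. It shows how channels can hold queries in flight while the parent context is canceled, with an atomic counter recording the queries that still succeed.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runQueries starts n fake queries, cancels the parent context while they are
// all in flight, then lets them finish. It returns how many succeeded.
func runQueries(n int) int64 {
	var succeeded atomic.Int64      // safe to update from many goroutines
	started := make(chan struct{})  // each query signals that it is in flight
	release := make(chan struct{})  // closed to let every query finish

	parent, cancel := context.WithCancel(context.Background())
	queryCtx := context.WithoutCancel(parent) // queries ignore parent cancellation

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			started <- struct{}{} // announce that this query is running
			<-release             // block until the coordinator releases it
			if queryCtx.Err() == nil {
				succeeded.Add(1) // the query was not interrupted
			}
		}()
	}

	// Wait until every query is in flight; no sleeps are needed.
	for i := 0; i < n; i++ {
		<-started
	}

	cancel()       // simulate shutdown while the queries are still running
	close(release) // now allow the queries to complete
	wg.Wait()
	return succeeded.Load()
}

func main() {
	fmt.Println(runQueries(3)) // 3
}
```

Replacing `queryCtx` with `parent` in the check would make every query observe the cancellation and the count drop to zero, which is the behavior the PR sets out to avoid.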
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]