Memberlist: support for debouncing notifications #592
Thank you, PR looks fine to me. Tests that do not initialize `UpdateInterval` are now failing.
I think the change is safe to enable by default, without needing a way to go back to the previous behaviour.
An alternative would be implementing rate-limiting in the memberlist KV client's `WatchKey`/`WatchPrefix` implementations, which would automatically cover all rings but only benefit memberlist users. (However, we would like to stop supporting other KV stores eventually, so that's not a big deal.)
Co-authored-by: Peter Štibraný <[email protected]>
I vendored this change locally and it actually does break tests in Mimir, as ruler tests attempt to wait 100ms for ring changes to propagate. I may just bite the bullet and make this change a no-op by default (synchronous propagation) and make the added delay configurable. @pstibrany let me know if you have thoughts.
/find-flaky-tests
I'm fine with having it disabled by default. One previously mentioned alternative would be to move this to memberlist KV store -- not only would it apply to all rings, but clients of memberlist KV store already expect updates to be delayed. WDYT?
Thank you, I'm fine with keeping this disabled by default. Once we
I had taken a cursory look at this earlier and it looked like I'd have to pepper a bunch of ad-hoc timer code into the bodies of `WatchKey`/`WatchPrefix`. But today I gave it another look and realized it could cleanly go in the memberlist notification handling, before the notifications make it to the Watch methods. I like this better! PTAL in your limited time. :)
Very nice! Thank you!
What this PR does:
- … `NotifyInterval`, at which point it will deliver notifications using the most recently-observed data.
- … `-memberlist.notify-interval`, which defaults to 0 (off).

Motivation for this change:
In clusters where the memberlist KVStore watched by Ring has many replicas, redeploying those replicas can cause `WatchKey` and `updateRingState` to be called hundreds of times per second. When there are many concurrent goroutines calling `ring.ShuffleShard`, the high rate of `updateRingState` calls (which take locks and clear caches) can create heavy lock contention and latency as `ShuffleShard` attempts to take locks in order to repopulate those caches.

Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`