sharding operators: topk / bottomk #7582
7ed0ff5 to 8efab08
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase
Done.
hello?
Hi @wanghaao, thanks for your contribution. We discussed this feature internally earlier, but it is not as straightforward as just sharding the topk. The reasoning originally came from @pracucci in that internal discussion, so I'm copying it here.
So in order to get the correct topk after the concat, the subquery should use K + B (where B is a buffer) instead of K. I don't have a good mathematical formula for B off the top of my head, since it depends on how evenly the data is distributed. That being said, we won't accept this PR, but if you figure out a nice way to solve ⬆️, feel free to comment here and update the PR.
No buffer can solve this if there is a sum aggregation inside the topk. topk and bottomk can be sharded if their expression is a plain vector selector. My intuition tells me that
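To make the two cases in this thread concrete, here is a minimal Python sketch of a toy sharding model. It is not Mimir's implementation: the helper names (`topk`, `sharded_topk`, `partial_sums`) and the sample values are hypothetical, and it assumes each series lives in exactly one shard and that ties can be ignored. Case 1 is a plain vector selector, where the concat of per-shard top-k still contains the global top-k. Case 2 is topk over a sum where the winning group is split across shards, so it never enters any per-shard top K + B.

```python
import heapq
from collections import Counter


def topk(k, values):
    """Return the k highest (name, value) pairs from a name -> value mapping."""
    return heapq.nlargest(k, values.items(), key=lambda kv: kv[1])


def sharded_topk(k, shards, buffer=0):
    """Take topk(k + buffer) per shard, concatenate, then take topk(k)."""
    merged = []
    for shard in shards:
        merged.extend(topk(k + buffer, shard))
    return heapq.nlargest(k, merged, key=lambda kv: kv[1])


# Case 1: plain vector selector. Each series lives in exactly one shard,
# so every globally top series is also in the top-k of its own shard.
series_shards = [
    {"s1": 9, "s2": 4, "s3": 1},
    {"s4": 8, "s5": 7, "s6": 2},
]
all_series = {name: v for shard in series_shards for name, v in shard.items()}
plain_exact = topk(2, all_series)
plain_sharded = sharded_topk(2, series_shards)


# Case 2: topk over sum by (group). Each shard only sees a partial sum.
# Group "x" is split over both shards; every other group is in one shard.
def partial_sums(shard_id, n_other):
    sums = {"x": 5}
    sums.update({f"g{shard_id}_{i}": 6 for i in range(n_other)})
    return sums


buffer = 3  # hypothetical B; a larger B fails the same way with more groups
sum_shards = [partial_sums(i, 1 + buffer) for i in range(2)]

totals = Counter()
for shard in sum_shards:
    totals.update(shard)  # Counter.update adds values per key

sum_exact = topk(1, totals)                               # "x" wins with 5 + 5
sum_sharded = sharded_topk(1, sum_shards, buffer=buffer)  # "x" never survives

print("plain:", plain_exact, plain_sharded)
print("sum:  ", sum_exact, sum_sharded)
```

In this toy model the failure in case 2 does not go away by raising `buffer`: as long as each shard holds at least K + B groups whose partial sum beats the partial sum of "x", the group with the highest total is dropped before the concat. Even when a split group does survive, the merged result holds its partial values rather than its total, so they would still have to be re-summed.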
What this PR does
Sharding topk/bottomk operators