Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Token aware load balancing for batches #448

Closed
wyfo opened this issue May 12, 2022 · 7 comments · Fixed by #508 or #573
Closed

Token aware load balancing for batches #448

wyfo opened this issue May 12, 2022 · 7 comments · Fixed by #508 or #573

Comments

@wyfo
Copy link
Contributor

wyfo commented May 12, 2022

Currently, batches don't support token aware load balancing. However, there could be situations where batches statement are guaranteed to be processed by the same nodes. Enabling token aware load balancing would then improve performance.

A possible implementation could be to add an additional configuration method, for example set_compute_token, and when the flag is set, the token of the batch's first prepared statement is computed to to be used in load balancing policy.

Any opinion about it? I'm volunteer to make the PR.

@psarna
Copy link
Contributor

psarna commented May 12, 2022

Oh, I thought we already speculatively do that (take the first prepared statement if available and use its token). I think we should even consider doing that unconditionally, since it's good practice to only batch statements within a single partition - in which case the user will be rewarded for their good practice with token awareness for free. Then we don't even need any set_compute_token, and the only downside is that batch statements with mixed partitions go to the node which owns the first statement - which is still a good idea, at least part of the batch application will be local.

Contributions welcome! I'll take liberty of assigning you to this one, so everyone knows it's taken. Thanks!

@Ten0
Copy link
Contributor

Ten0 commented Jun 5, 2022

I thought we already speculatively do that

we should even consider doing that unconditionally

As a new user, I don't understand how speculative isn't unconditional. Could you elaborate? Or do you mean we actually currently don't already do that?

Thanks,

@psarna
Copy link
Contributor

psarna commented Jun 6, 2022

Now that I read it, I think I misused the word "speculatively", "optimistically" would have been better. I meant the decision to be unconditional - we optimistically assume that the batch only consists of a single partition, so we just extract token info from the first statement.

@Ten0
Copy link
Contributor

Ten0 commented Jun 6, 2022

So just to be sure I understand correctly, this is not currently implemented but is the plan?

@wyfo
Copy link
Contributor Author

wyfo commented Jun 6, 2022

So just to be sure I understand correctly, this is not currently implemented but is the plan?

Indeed, looking at the code, it's not implemented.
I had planned to do it but I also had a small accident and I'm now a little bit less productive with a cast on my wrist ...
So I think I should unassign me for now, until I recover.

@wyfo wyfo removed their assignment Jun 6, 2022
@Ten0
Copy link
Contributor

Ten0 commented Aug 13, 2022

This documentation seems to say that ideal performance would be attained by grouping together queries that end up in the same partition.

However it looks like since:


is not public (#468)
that may actually be impossible right now.

It looks then like as part of implementing this feature, this should also be made public.

cvybhu added a commit to cvybhu/scylla-rust-driver that referenced this issue Oct 4, 2022
The new code, and in particualr the fancy GAT workaroound,
cause problems when trying to pass batch values by reference,
as described in scylladb#568.

Fixes: scylladb#568
Unfixes: scylladb#448

This reverts commit 24ee954.

Signed-off-by: Jan Ciolek <[email protected]>
@cvybhu
Copy link
Contributor

cvybhu commented Oct 4, 2022

Reopening because of #569

@cvybhu cvybhu reopened this Oct 4, 2022
Ten0 added a commit to Ten0/scylla-rust-driver that referenced this issue Mar 11, 2023
Helps (or arguably fixes) scylladb#468

For smarter batch constitution (following up on scylladb#448), it is required
to have access to the token a particular statement would be directed to.
However, that is pretty difficult to do without access to
calculate_token.

That was initially put in scylladb#508, but planned to put in a separate pr to
keep it minimal
(scylladb#508 (comment))

It seems I had forgotten to open that separate PR, and I end up getting
bitten by that now that I want to constitute my batches in a smarter
way.

So here it is.

I'm only putting "helps" above because I think we may want to also
expose query planning (
`session.plan(prepared_statement, serialized_values) -> impl Iterator<Item = (Arc<Node>, ShardID)>`
) as that may make it significantly easier - but I'd like to keep this
PR that just *enables* the ideal behavior as simple as possible.
@Ten0 Ten0 mentioned this issue Mar 11, 2023
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants