Token aware load balancing for batches #448

wyfo · 2022-05-12T12:26:12Z

Currently, batches don't support token aware load balancing. However, there are situations where all statements of a batch are guaranteed to be processed by the same nodes. Enabling token aware load balancing would then improve performance.

A possible implementation could be to add an additional configuration method, for example set_compute_token. When the flag is set, the token of the batch's first prepared statement is computed and used by the load balancing policy.

Any opinions about it? I volunteer to make the PR.


psarna · 2022-05-12T13:39:22Z

Oh, I thought we already speculatively do that (take the first prepared statement if available and use its token). I think we should even consider doing that unconditionally, since it's good practice to only batch statements within a single partition - in which case the user will be rewarded for their good practice with token awareness for free. Then we don't even need any set_compute_token, and the only downside is that batch statements with mixed partitions go to the node which owns the first statement - which is still a good idea, since at least part of the batch application will be local.
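To make the optimistic approach concrete, here is a minimal sketch of it. It is not the driver's code: `Token`, `BatchStatement`, `batch_routing_token` and `owner` are hypothetical names invented for illustration, and the ring is a simplified map from ring token to node name. The batch is routed by the token of its first prepared statement, and the owner is the first ring entry at or after that token, wrapping around at the end.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a partition token (not the driver's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Token(i64);

/// Hypothetical, simplified view of a statement inside a batch.
enum BatchStatement {
    /// Unprepared statement: no partition key information on the client.
    Unprepared,
    /// Prepared statement whose token was computed from its bound values.
    Prepared(Token),
}

/// Optimistically assume the batch touches a single partition and
/// return the token of the first prepared statement, if there is one.
fn batch_routing_token(statements: &[BatchStatement]) -> Option<Token> {
    statements.iter().find_map(|s| match s {
        BatchStatement::Prepared(token) => Some(*token),
        BatchStatement::Unprepared => None,
    })
}

/// Simplified ring lookup: the owner is the first ring entry whose token
/// is >= the requested token, wrapping to the smallest entry otherwise.
fn owner<'a>(ring: &BTreeMap<Token, &'a str>, token: Token) -> Option<&'a str> {
    ring.range(token..)
        .next()
        .or_else(|| ring.iter().next())
        .map(|(_, node)| *node)
}
```

With a mixed-partition batch, only the first prepared statement decides the target, so the remaining statements may still need to be forwarded by the coordinator. A batch with no prepared statement yields `None` and would fall back to non-token-aware routing.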

Contributions welcome! I'll take the liberty of assigning you to this one, so everyone knows it's taken. Thanks!

Ten0 · 2022-06-05T21:36:32Z

> I thought we already speculatively do that

> we should even consider doing that unconditionally

As a new user, I don't understand how speculative isn't unconditional. Could you elaborate? Or do you mean we actually currently don't already do that?

Thanks,

psarna · 2022-06-06T14:01:08Z

Now that I read it, I think I misused the word "speculatively", "optimistically" would have been better. I meant the decision to be unconditional - we optimistically assume that the batch only consists of a single partition, so we just extract token info from the first statement.

Ten0 · 2022-06-06T16:02:58Z

So just to be sure I understand correctly, this is not currently implemented but is the plan?

wyfo · 2022-06-06T20:12:02Z

> So just to be sure I understand correctly, this is not currently implemented but is the plan?

Indeed, looking at the code, it's not implemented.
I had planned to do it but I also had a small accident and I'm now a little bit less productive with a cast on my wrist ...
So I think I should unassign myself for now, until I recover.

Ten0 · 2022-08-13T13:33:20Z

This documentation seems to say that ideal performance would be attained by grouping together queries that end up in the same partition.

However, it looks like since `calculate_token` (scylla-rust-driver/scylla/src/transport/session.rs, line 1318 in 71889a0) is not public (#468), that may actually be impossible right now.

So it looks like, as part of implementing this feature, calculate_token should also be made public.

The new code, and in particular the fancy GAT workaround, cause problems when trying to pass batch values by reference, as described in scylladb#568. Fixes: scylladb#568 Unfixes: scylladb#448 This reverts commit 24ee954. Signed-off-by: Jan Ciolek <[email protected]>

cvybhu · 2022-10-04T16:05:47Z

Reopening because of #569

Helps (or arguably fixes) scylladb#468. For smarter batch constitution (following up on scylladb#448), it is required to have access to the token a particular statement would be directed to. However, that is pretty difficult to do without access to calculate_token. That was initially put in scylladb#508, but was then planned to go in a separate PR to keep it minimal (scylladb#508 (comment)). It seems I had forgotten to open that separate PR, and I ended up getting bitten by that now that I want to constitute my batches in a smarter way. So here it is. I'm only putting "helps" above because I think we may want to also expose query planning ( `session.plan(prepared_statement, serialized_values) -> impl Iterator<Item = (Arc<Node>, ShardID)>` ) as that may make it significantly easier - but I'd like to keep this PR that just *enables* the ideal behavior as simple as possible.
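As an illustration of what "smarter batch constitution" could look like once the token is accessible, here is a rough sketch under stated assumptions. `group_by_owner` is a hypothetical helper, and the `token_of` and `owner_of` closures are stand-ins for whatever the driver exposes (a token calculation and a token-to-node lookup). Their signatures are made up for this example and do not mirror the driver's API.

```rust
use std::collections::BTreeMap;

/// Group statements by the node that owns their token, so that each
/// resulting group can be sent as its own batch to that node.
///
/// `token_of` and `owner_of` are hypothetical hooks standing in for the
/// driver's token calculation and ring lookup.
fn group_by_owner<S, N: Ord>(
    statements: Vec<S>,
    token_of: impl Fn(&S) -> i64,
    owner_of: impl Fn(i64) -> N,
) -> BTreeMap<N, Vec<S>> {
    let mut batches: BTreeMap<N, Vec<S>> = BTreeMap::new();
    for stmt in statements {
        // Resolve the owning node for this statement's token.
        let node = owner_of(token_of(&stmt));
        // Append to that node's group, preserving submission order.
        batches.entry(node).or_default().push(stmt);
    }
    batches
}
```

Each group could then be submitted as a separate batch. Grouping by node (or by node and shard) rather than by exact token is a design choice that trades strict single-partition batches for fewer, larger ones.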

psarna assigned wyfo May 12, 2022

wyfo removed their assignment Jun 6, 2022

junglie85 mentioned this issue Jul 8, 2022

Exposing Shard ID publicly #468

Open

Ten0 mentioned this issue Aug 14, 2022

Pick connections based on batch first statement's shard #508

Merged

6 tasks

cvybhu closed this as completed in #508 Sep 30, 2022

cvybhu mentioned this issue Oct 4, 2022

Passing batch values by reference doesn't work #568

Closed

cvybhu mentioned this issue Oct 4, 2022

Revert "Pick connections based on batch first statement's shard" #569

Merged

6 tasks

cvybhu reopened this Oct 4, 2022

Ten0 mentioned this issue Oct 5, 2022

Pick connections based on batch first statement's shard - v2 #573

Merged

6 tasks

piodul closed this as completed in #573 Nov 15, 2022

Ten0 mentioned this issue Mar 11, 2023

Expose calculate_token #658

Merged

6 tasks

Ten0 mentioned this issue Jul 2, 2023

Shard aware batching - add Session::shard_for_statement & Batch::enforce_target_node #738

Open

8 tasks
