
Expose calculate_token #658

Merged
merged 1 commit into from
Mar 20, 2023

Conversation

@Ten0 (Contributor) commented Mar 11, 2023

Helps (or arguably fixes) #468

For smarter batch constitution (following up on #448), one needs access to the token that a particular statement would be routed to. However, that is pretty difficult to obtain without access to `calculate_token`.

That was initially included in #508, but it was planned to move it to a separate PR to keep that one minimal (#508 (comment)).
It seems I had forgotten to open that separate PR, and I'm now getting bitten by that, since I want to constitute my batches in a smarter way. ^^'

So here it is.

I'm only saying "helps" above because I think we may also want to expose query planning (`session.plan(prepared_statement, serialized_values) -> impl Iterator<Item = (Arc<Node>, ShardID)>`), as that may make it significantly easier for the user to obtain the relevant keys for batching. However, I'd like to keep this PR, which just *enables* the ideal behavior, as simple as possible.
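To illustrate why exposing the token helps with batch constitution, here is a minimal, self-contained sketch of grouping statements by token. `Token` and `mock_calculate_token` are hypothetical stand-ins: the real driver derives the token from the statement's serialized partition key (via the exposed `calculate_token`), not from a plain string hash as done here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the driver's token type (a ring position).
type Token = i64;

/// Hypothetical stand-in for the real `calculate_token`: in the driver, the
/// token comes from the statement's partition key, not a generic hash.
fn mock_calculate_token(partition_key: &str) -> Token {
    let mut h = DefaultHasher::new();
    partition_key.hash(&mut h);
    h.finish() as i64
}

/// Group statements by the token they route to, so that each batch can be
/// sent to a single target rather than fanning out across the cluster.
fn group_by_token(partition_keys: &[&str]) -> HashMap<Token, Vec<String>> {
    let mut batches: HashMap<Token, Vec<String>> = HashMap::new();
    for key in partition_keys {
        batches
            .entry(mock_calculate_token(key))
            .or_default()
            .push((*key).to_string());
    }
    batches
}

fn main() {
    // Statements sharing a partition key share a token, so they land in the
    // same batch; distinct keys (almost always) land in distinct batches.
    let batches = group_by_token(&["a", "b", "a", "c", "b"]);
    for (token, stmts) in &batches {
        println!("token {token}: {stmts:?}");
    }
}
```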

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to PR description.

@piodul piodul merged commit ba5d06f into scylladb:main Mar 20, 2023
Ten0 added a commit to Ten0/scylla-rust-driver that referenced this pull request Jun 3, 2023
Resolves scylladb#468

This is a follow-up on scylladb#508 and scylladb#658:
- To minimize CPU usage related to network operations when inserting a very large number of rows, it is relevant to batch.
- To batch in the most efficient manner, these batches have to be shard-aware. Since scylladb#508, `batch` will pick the shard of the first statement to send the query to. However, it is left to the user to constitute the batches in such a way that the target shard is the same for all elements of the batch.
- This was made *possible* by scylladb#658, but it was still very boilerplate-heavy. I was waiting for scylladb#612 to be merged (amazing work btw! 😃) to implement a more direct and factored API (as that would use it).
- The new `Session::first_shard_for_statement(self, &PreparedStatement, &SerializedValues) -> Option<(Node, Option<Shard>)>` makes shard-aware batching easy for users by providing access to the first node and shard of the query plan.
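As a rough, self-contained sketch of how such an API enables shard-aware batch constitution: statements are grouped by the `(node, shard)` pair their token routes to, so every statement in a batch targets the same shard. `NodeId`, `Shard`, and `first_shard_for_token` below are simplified hypothetical stand-ins for the driver types and the lookup described above, not the actual API.

```rust
use std::collections::HashMap;

/// Simplified stand-ins for the driver's node and shard identifiers.
type NodeId = u32;
type Shard = u32;

/// Hypothetical stand-in for the `first_shard_for_statement` lookup: it maps a
/// statement's token to the first (node, shard) of its query plan. The real
/// driver would consult the cluster topology instead of this toy arithmetic.
fn first_shard_for_token(token: i64, nodes: u32, shards_per_node: u32) -> (NodeId, Shard) {
    let node = token.rem_euclid(nodes as i64) as NodeId;
    let shard = token.rem_euclid(shards_per_node as i64) as Shard;
    (node, shard)
}

/// Constitute batches so that all statements in a batch share a target shard.
fn constitute_batches(tokens: &[i64]) -> HashMap<(NodeId, Shard), Vec<i64>> {
    let mut batches: HashMap<(NodeId, Shard), Vec<i64>> = HashMap::new();
    for &token in tokens {
        batches
            .entry(first_shard_for_token(token, 3, 2))
            .or_default()
            .push(token);
    }
    batches
}

fn main() {
    // Tokens 0 and 6 route to the same (node, shard) in this toy topology,
    // so they end up in the same shard-aware batch.
    let batches = constitute_batches(&[0, 6, 1]);
    for ((node, shard), tokens) in &batches {
        println!("node {node}, shard {shard}: {tokens:?}");
    }
}
```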