Restore improvement: index #4044
Conversation
Force-pushed from db5c9ec to 1f15c79
Right now SM restores location by location, manifest by manifest, table by table. That's why it tracks restore progress by keeping location/manifest/table in the DB. We are moving away from the sequential restore approach in favor of restoring from all locations/manifests/tables at the same time.
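To make the direction concrete, here is a minimal Go sketch of processing all locations concurrently instead of one after another. The `Location` type and `indexLocation` helper are hypothetical stand-ins for illustration, not the actual SM API.

```go
package sketch

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Location is a hypothetical stand-in for a backup location.
type Location struct {
	Name string
}

// indexLocation stands in for the per-location work (indexing manifests,
// tables and remote sstable dirs).
func indexLocation(ctx context.Context, loc Location) error {
	_ = loc // placeholder for the real work
	return nil
}

// indexAllLocations processes every location concurrently instead of
// location by location, manifest by manifest, table by table.
func indexAllLocations(ctx context.Context, locations []Location) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, loc := range locations {
		loc := loc // capture for the goroutine (pre-Go 1.22 loop semantics)
		g.Go(func() error {
			return indexLocation(ctx, loc)
		})
	}
	return g.Wait()
}
```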
Force-pushed from 1f15c79 to bc04376
…dFiles: There is no need to iterate versioned files (ListVersionedFiles) and non-versioned files (buildFilesSizesCache) separately. Doing it in a single iteration is faster and allows storing all size information in a single place.
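As an illustration, a single-pass sketch under assumed naming: the `RemoteFile` type and the `.sm_` versioned-name suffix are made up for this example, while the real code works with the rclone listing types.

```go
package sketch

import "strings"

// RemoteFile is an assumed minimal view of a listed remote object.
type RemoteFile struct {
	Name string
	Size int64
}

type fileInfo struct {
	Size      int64
	Versioned bool
}

// buildSizeIndex walks the remote listing once, classifying each file as
// versioned or not and recording its size in a single place. The ".sm_"
// versioned-name suffix is an assumed convention for this example.
func buildSizeIndex(files []RemoteFile) map[string]fileInfo {
	index := make(map[string]fileInfo, len(files))
	for _, f := range files {
		name, versioned := f.Name, false
		if i := strings.Index(f.Name, ".sm_"); i >= 0 {
			name, versioned = f.Name[:i], true
		}
		// Note: if both a versioned and a non-versioned copy exist, the
		// later entry wins; the real code needs an explicit policy here.
		index[name] = fileInfo{Size: f.Size, Versioned: versioned}
	}
	return index
}
```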
Force-pushed from bc04376 to 1762b02
@karol-kokoszka This PR is ready for review!
General note: I'm missing some documentation here. For example, how do we sort them (if we do) after indexing? Do we have any way to control it, etc.?
We have just the document (design doc?) describing the restore in Scylla Manager 3.1. I agree that all the details that are going to be included in 3.4 must be documented. We can either make it public by editing the README.md of https://github.com/scylladb/scylla-manager/tree/master/pkg/service/restore and putting a reference to it in the main README, or by preparing a new design doc documenting all the changes/updates and the current state. The design and list of changes were introduced at our meeting where we wanted to understand the bottlenecks of the current implementation. We haven't created a formal design doc, just a set of issues. Issue for the documentation: #4051
👍
@karol-kokoszka I addressed all the comments, please take a look.
Force-pushed from 97f9642 to 0efed73
Please check why the Sanity check fails.
Because of the known flake: #3848
This commit introduces the structure of the restore workload. The workload is divided per location -> table -> remote sstable directory, which changes the hierarchy established by manifests (location -> node -> table -> remote sstable dir). It also aggregates files into actual sstables, extracts their IDs, aggregates their sizes, and keeps track of sstable versioning (see the sketch below).
The indexed workload won't contain sstables that were already restored during a previous restore run.
This is a temporary implementation used for integrating workload indexing with the rest of the code. It will be improved as a part of #3979.
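A rough sketch of that hierarchy in Go; the type and field names are illustrative, not the PR's actual definitions.

```go
package sketch

// LocationWorkload -> TableWorkload -> RemoteDirWorkload mirrors the
// location -> table -> remote sstable dir hierarchy described above.
type LocationWorkload struct {
	Location string
	Tables   []TableWorkload
}

type TableWorkload struct {
	Keyspace string
	Table    string
	Dirs     []RemoteDirWorkload
}

type RemoteDirWorkload struct {
	RemoteSSTableDir string
	SSTables         []RemoteSSTable
}

// RemoteSSTable aggregates the component files of a single sstable.
type RemoteSSTable struct {
	ID        string   // extracted from component file names
	Size      int64    // sum of component file sizes
	Versioned bool     // true if any component is versioned
	Files     []string // component file names
}
```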
This commit applies the new indexing and batching approaches to the restore tables codebase.
Recent commits changed the versioned batch download so that if any sstable component is versioned, then all sstable components are downloaded as versioned files. This was done to allow easier versioned progress calculation (we don't store per-file size, only the whole sstable size).

This brought to light a bug (which existed before, but was harder to hit) in which restoring a batch failed when the whole batch was versioned, as calling RcloneSyncCopyPaths with an empty paths parameter resulted in a broken download.

We could just skip the RcloneSyncCopyPaths call when the whole batch is versioned, but that would leave us without the agentJobID, which is a part of the sort key in RestoreRunProgress. Without it, we could potentially overwrite one restore run progress with another if both of them happened on the same RemoteSSTableDir, by the same Host, and were fully versioned. It would also introduce a different path for restoring a regular batch and a fully versioned batch, which is not desirable.

That's why I decided to modify the rclone server to allow an empty paths parameter, so that it still generates an agentJobID but doesn't do anything beyond that.
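To illustrate the sort-key concern with a toy example: if the progress row were keyed only by remote dir and host, two fully versioned batches restored by the same host from the same dir would map to the same key and overwrite each other, while including an agent job ID keeps them distinct. Field names here are simplified, not the exact schema.

```go
package main

import "fmt"

// restoreRunProgressKey is a simplified illustration of the progress row key.
type restoreRunProgressKey struct {
	RemoteSSTableDir string
	Host             string
	AgentJobID       int64 // keeps otherwise identical rows distinct
}

func main() {
	progress := map[restoreRunProgressKey]string{}

	// Two fully versioned batches restored by the same host from the same
	// remote dir: without AgentJobID they would share one key and the second
	// write would overwrite the first.
	progress[restoreRunProgressKey{"s3://backup/dir", "192.168.100.11", 1}] = "batch A"
	progress[restoreRunProgressKey{"s3://backup/dir", "192.168.100.11", 2}] = "batch B"

	fmt.Println(len(progress)) // 2 -- both rows survive
}
```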
Workload info contains the sstable count, total size, and max and average sstable size per location/table/remote sstable dir.
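For example, these aggregates can be computed in a single pass over the indexed sstable sizes; this is just a sketch, not the actual implementation.

```go
package sketch

// sizeStats holds the aggregates kept per location/table/remote sstable dir.
type sizeStats struct {
	Count   int
	Total   int64
	Max     int64
	Average int64
}

// aggregate computes the stats in a single pass over indexed sstable sizes.
func aggregate(sizes []int64) sizeStats {
	var s sizeStats
	for _, size := range sizes {
		s.Count++
		s.Total += size
		if size > s.Max {
			s.Max = size
		}
	}
	if s.Count > 0 {
		s.Average = s.Total / int64(s.Count)
	}
	return s
}
```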
Force-pushed from 0efed73 to 503bd5c
This PR is the core of the restore improvements prepared for SM 3.4.
It refactors the general workflow from restoring location by location, manifest by manifest, table by table, to a version where nothing waits on anything else! All of the remote sstable dirs are indexed at the beginning and fed to the batch dispatcher, which controls batch creation. For now, the batch dispatcher creates batches in the same way as before, but it will be improved as a part of #3979.
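A high-level sketch of that flow under stated assumptions: the indexed dirs are handed to a dispatcher from which restoring hosts pull batches. The names (`batchDispatcher`, `DispatchBatch`) are illustrative; the real dispatcher's API may differ.

```go
package sketch

import "sync"

// batch is an illustrative unit of work handed to a restoring host.
type batch struct {
	RemoteSSTableDir string
	SSTableIDs       []string
}

// batchDispatcher holds the fully indexed workload and hands out batches.
type batchDispatcher struct {
	mu      sync.Mutex
	pending []batch
}

func newBatchDispatcher(indexed []batch) *batchDispatcher {
	return &batchDispatcher{pending: indexed}
}

// DispatchBatch returns the next batch for the given host, or false when
// there is no more work. For now a batch simply mirrors a remote sstable dir,
// as before; smarter batching is planned as part of #3979.
func (d *batchDispatcher) DispatchBatch(host string) (batch, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(d.pending) == 0 {
		return batch{}, false
	}
	b := d.pending[0]
	d.pending = d.pending[1:]
	_ = host // the real dispatcher may take host properties into account
	return b, true
}
```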
Rebased on #4040.
Fixes #3981