Add backup benchmarking under read stress #9307
Conversation
(force-pushed from 91974c9 to f0c02f5)
I have to admit it looks super suspicious that the read did not degrade even slightly.
@kreuzerkrieg There was some refactoring done for Manager tests.
(force-pushed from f0c02f5 to 1c6cd01)
(force-pushed from 1c6cd01 to 8ee7444)
Rebased and moved the test to another class.
(force-pushed from 8ee7444 to 0d922c4)
mgmt_cli_test.py (Outdated)
backup_task_status = task.wait_for_uploading_stage(timeout=200000)
assert backup_task_status, "Backup has failed!"
I'd recommend using task.wait_for_status(...). In that case there is no need for the assertion - the waiter fails by itself if the timeout passes.
What do you mean? wait_for_uploading_stage calls task.wait_for_status and returns a bool, so then I'd have to assert on is_status_reached or something.
I'm expecting task.wait_for_status to raise a WaitForTimeoutError exception if the timeout is reached. But whatever, the comment can be ignored - not significant at all.
Btw, your code could be shortened a bit:
assert task.wait_for_uploading_stage(timeout=200000), "Backup has failed!"
From my point of view it improves readability, but it's up to you.
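For illustration, here is a minimal sketch of the two variants being discussed, assuming `task` is the manager backup task object used in `mgmt_cli_test.py`; the exact `wait_for_status` arguments and its timeout behaviour are assumptions, not confirmed API:

```python
def wait_variant_bool(task, timeout=200000):
    # Current approach: the waiter returns a bool, so the caller must assert.
    # Collapsed into one line as suggested in the review.
    assert task.wait_for_uploading_stage(timeout=timeout), "Backup has failed!"


def wait_variant_raising(task, timeout=200000):
    # Reviewer's expectation: a waiter that raises on timeout by itself
    # (assumed here to raise WaitForTimeoutError), so no assertion is needed.
    task.wait_for_status("UPLOADING", timeout=timeout)  # hypothetical call shape
```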
The units in the issue should be …
(force-pushed from 8c6061d to 886e8cd)
(force-pushed from 660fa90 to b16c2d4)
It is the way it is presented, I think - I do send seconds to it, and it presents it as a time.
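As a side note on the units: the behaviour described above (seconds go in, a time comes out) matches how a plain `timedelta` renders a duration; this is only an illustration of the formatting, not the actual reporting code:

```python
from datetime import timedelta

backup_duration_sec = 437                      # illustrative value in seconds
print(timedelta(seconds=backup_duration_sec))  # prints 0:07:17, i.e. shown as a time
```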
From the code I see that … Also, why …
Should I name it "snapshot time"?
Good question, the total time is taken from backup.
It's not only about the snapshot (e.g. we also fetch the schema, create manifests, etc.). I created an issue about displaying backup upload bandwidth/duration - in the future it could make things easier for such benchmarks.
Now that I see it is a negligible slice of the whole process, I guess it is worth just dropping it.
(force-pushed from ae22f65 to 96cecc7)
(force-pushed from 96cecc7 to 4882bf3)
Add `test_backup_benchmark` test which measures backup time under read stress
(force-pushed from 4882bf3 to 359bdca)
LGTM
@scylladb/qa-maintainers could you push it forward please?
I briefly looked at the 100GB dataset - to me, ingestion of this size should take a bit more than 7m - did you verify the correctness of this element?
OK, I did the calculations a bit wrong the first time; it looks like 7m should be enough with this throughput.
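For reference, the back-of-the-envelope estimate behind "7m should be enough" looks roughly like this; the throughput figure is an illustrative assumption, not a number measured in this run:

```python
dataset_gb = 100                       # size of the ingested dataset
assumed_write_throughput_mb_s = 250    # illustrative aggregate write throughput (assumption)

ingestion_sec = dataset_gb * 1024 / assumed_write_throughput_mb_s
print(f"~{ingestion_sec / 60:.1f} minutes")    # ~6.8 minutes, i.e. just under 7m
```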
Introduce the test_backup_benchmark test, which measures backup time under read stress conditions. The test performs multiple actions to consolidate all the necessary data into a single table. Initially, it runs and measures the backup process on its own, followed by the read stress test on its own. Finally, it executes both processes asynchronously to observe how much the performance of reading and backing up degrades.
More metrics could be added after scylladb/scylla-manager#4123 is completed.
The test should be re-run after scylladb/scylla-manager#4125 is completed.
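To make the flow concrete, here is a rough, hypothetical sketch of the benchmark structure described above; the helper callables stand in for the actual SCT steps (writing the dataset, running the manager backup, running the read stress) and are not the real implementation:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure(step):
    # Tiny timing helper: run the step and return its wall-clock duration in seconds.
    start = time.perf_counter()
    step()
    return time.perf_counter() - start


def run_backup_benchmark(populate_table, run_backup, run_read_stress):
    # 1. Consolidate all the data into a single table.
    populate_table()

    # 2. Baselines: backup alone, then read stress alone.
    backup_alone = measure(run_backup)
    read_alone = measure(run_read_stress)

    # 3. Run backup and read stress concurrently to observe mutual degradation.
    with ThreadPoolExecutor(max_workers=2) as pool:
        backup_future = pool.submit(measure, run_backup)
        read_future = pool.submit(measure, run_read_stress)
        backup_under_stress = backup_future.result()
        read_under_stress = read_future.result()

    return backup_alone, read_alone, backup_under_stress, read_under_stress
```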
Argus results:
For the 100GB run
fixes: #8752