perf(storage): simplify table watermark index #15931
Added a benchmark. The results are as follows. The test setting is 500 epochs between the safe epoch and the committed epoch, plus 500 staging epochs. Vnodes are partitioned into 16 parts.
[Benchmark output: this branch vs. main branch]
For queries, the time to query the latest epoch is 85% lower than on the main branch, while the time to query a middle epoch is 7 times that of the main branch. This is because the implementation searches linearly from the latest epoch to older epochs, and is therefore biased toward querying the watermark at the latest epoch. In conclusion, compared to the main branch, this PR saves significant time in building and maintaining the table watermark index. In terms of queries, this branch is even faster than the main branch when querying the latest epoch. It only performs poorly when querying an older MVCC epoch, which is not a common use case.
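The newest-to-oldest lookup described above can be sketched roughly as follows. This is a simplified illustration, not the PR's actual code: only `staging_watermarks` and the reverse iteration appear in the diff, while `WatermarkIndex`, `VnodeWatermark`, `committed_watermarks`, `read_watermark`, and the `u64` watermark values are hypothetical stand-ins chosen for the example.

```rust
use std::collections::{BTreeMap, VecDeque};
use std::ops::Range;

/// Hypothetical stand-in: one watermark covering a contiguous vnode range.
#[derive(Clone, Debug)]
struct VnodeWatermark {
    vnodes: Range<usize>,
    watermark: u64,
}

/// Hypothetical simplified index: no per-vnode structure is built,
/// only per-epoch lists of watermarks.
struct WatermarkIndex {
    /// Uncommitted epochs in ascending epoch order.
    staging_watermarks: VecDeque<(u64, Vec<VnodeWatermark>)>,
    /// Committed epochs, all assumed older than any staging epoch.
    committed_watermarks: BTreeMap<u64, Vec<VnodeWatermark>>,
}

impl WatermarkIndex {
    /// Returns the watermark visible to `vnode` when reading at `epoch`.
    fn read_watermark(&self, vnode: usize, epoch: u64) -> Option<u64> {
        // Iterate from new epoch to old epoch: staging first, then committed.
        let newest_first = self
            .staging_watermarks
            .iter()
            .rev()
            .map(|(e, list)| (*e, list))
            .chain(
                self.committed_watermarks
                    .iter()
                    .rev()
                    .map(|(e, list)| (*e, list)),
            );
        for (watermark_epoch, vnode_watermark_list) in newest_first {
            if watermark_epoch > epoch {
                // Written after the read epoch, so not visible to this read.
                continue;
            }
            if let Some(w) = vnode_watermark_list
                .iter()
                .find(|w| w.vnodes.contains(&vnode))
            {
                return Some(w.watermark);
            }
        }
        None
    }
}
```

A read at the latest epoch usually returns within the first few entries, whereas a read at a middle epoch has to skip every newer entry first, which matches the asymmetry in the benchmark numbers.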
LGTM! The micro-bench result looks good! Thanks for the quick fix.
        .cloned()
})
// iterate from new epoch to old epoch
for (watermark_epoch, vnode_watermark_list) in self.staging_watermarks.iter().rev().chain(
"the time to query a middle epoch is 7 times the main branch. This is because the implementation is linearly searching from latest epoch to old epoch, and therefore it's biased for querying the watermark in the latest epoch"
This is just a note: I guess we could use binary search to make queries for non-latest epochs faster, but given that only internal state tables contain table watermarks and internal state table reads are all at the latest epoch, we don't need to worry about that right now.
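As a rough illustration of the binary-search idea in the note above (not something this PR implements), one could first locate the newest entry visible at the read epoch and start the backward scan from there. The sketch assumes the entries are available as a slice sorted by ascending epoch; `visible_prefix` and the `(u64, T)` entry shape are hypothetical.

```rust
/// Hypothetical helper: `entries` must be sorted by ascending epoch.
/// Returns the prefix of entries whose epoch is <= `read_epoch`, found by
/// binary search; a newest-to-oldest scan can then start at the prefix's end
/// instead of skipping newer entries one by one.
fn visible_prefix<T>(entries: &[(u64, T)], read_epoch: u64) -> &[(u64, T)] {
    // `partition_point` binary-searches for the first entry with epoch > read_epoch.
    let end = entries.partition_point(|(epoch, _)| *epoch <= read_epoch);
    &entries[..end]
}
```

Iterating `visible_prefix(entries, read_epoch).iter().rev()` would then visit only visible epochs, reducing the skip cost for a middle epoch from linear to logarithmic in the number of epochs.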
…16173) Co-authored-by: William Wen <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Avoid building per vnode table watermark index.
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.