Bug(compaction): Unable to trigger split in time, when barrier latency is high #15291

Open
Li0k opened this issue Feb 27, 2024 · 4 comments
Labels: no-issue-activity, type/bug

Comments

Li0k (Contributor) commented Feb 27, 2024

Describe the bug

In Hummock, whether to split a compaction group is decided from the flush throughput measured for each table. The check is done in:

async fn on_handle_check_split_multi_group(&self) {

To reduce the effect of jitter, the statistics are kept in a window of window_size samples, and a new sample is added to the window at each commit_epoch: https://github.com/risingwavelabs/risingwave/blob/41f4ad55c636836fc9c7f7860ada535e26dbd6ca/src/meta/src/hummock/manager/mod.rs#L1779

Recently, we found that when a barrier carries a large amount of data, barrier latency rises and commit_epoch happens less often. The statistics are therefore not updated in time, and the split is not triggered in time.
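To illustrate the problem, here is a minimal sketch of a commit-driven throughput window. It is not the actual Hummock code: `ThroughputWindow`, `record_commit`, `should_split`, `secs_until_window_full`, and the "every sample in a full window exceeds the threshold" rule are all assumptions made for this example. The point is only that the window gains one sample per committed epoch, so the wall-clock time needed to fill it grows with barrier latency.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical per-table throughput history (not the real Hummock type).
pub struct ThroughputWindow {
    window_size: usize,
    // table_id -> bytes flushed in each of the most recent committed epochs
    samples: HashMap<u32, VecDeque<u64>>,
}

impl ThroughputWindow {
    pub fn new(window_size: usize) -> Self {
        assert!(window_size > 0, "window_size must be positive");
        Self { window_size, samples: HashMap::new() }
    }

    /// Called once per commit_epoch with the bytes flushed for `table_id`.
    /// The oldest sample is dropped once the window is full.
    pub fn record_commit(&mut self, table_id: u32, bytes: u64) {
        let window = self.samples.entry(table_id).or_default();
        if window.len() == self.window_size {
            window.pop_front();
        }
        window.push_back(bytes);
    }

    /// Assumed rule: split only when the window is full and every sample
    /// exceeds `threshold`, so a single spike cannot trigger a split.
    pub fn should_split(&self, table_id: u32, threshold: u64) -> bool {
        match self.samples.get(&table_id) {
            Some(window) => {
                window.len() == self.window_size
                    && window.iter().all(|&bytes| bytes > threshold)
            }
            None => false,
        }
    }
}

/// Wall-clock seconds needed to fill the window when each commit_epoch
/// is delayed by `barrier_latency_secs`.
pub fn secs_until_window_full(window_size: usize, barrier_latency_secs: u64) -> u64 {
    window_size as u64 * barrier_latency_secs
}
```

Under these assumptions, a window of 3 samples fills in about 3 seconds with 1-second barriers but needs about 180 seconds with 60-second barriers, even though the table is writing heavily the whole time.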

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

Li0k added the type/bug label on Feb 27, 2024
github-actions bot added this to the release-1.7 milestone on Feb 27, 2024
Li0k (Contributor, Author) commented Feb 27, 2024

I assume the write amplification within cg2 / cg3 is still caused by data misalignment.
It doesn't seem reasonable to split a compaction group directly during table creation or recovery (we don't support merge at the moment).

I prefer to do some data analysis in the flush phase and perform a split on the SST to promote boundary alignment.

@Little-Wallace @zwang28 @hzxa21

hzxa21 (Collaborator) commented Feb 27, 2024

> I prefer to do some data analysis in the flush phase and perform a split on the SST to promote boundary alignment.

By "split", do you mean putting data for specific table IDs into separate SSTs, rather than splitting the compaction group?

If so, is this a permanent change (applied to all future data for these tables) or a temporary one (applied to data for these tables only for some period)?

Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

hzxa21 (Collaborator) commented Oct 8, 2024

I think the new split strategy (WIP) can resolve this issue, right? cc @Li0k
