db: ignore writes below new block transaction on recovery #864

Merged
asubiotto merged 2 commits into main from alfonso-drotation
May 17, 2024


asubiotto
Member

Previously, the recovery code ignored writes below the persist transaction of a table block. However, this could drop writes, because on rotation the active block is swapped atomically before the old block is persisted. Writes that arrive between these two events go to the new table block, but recovery ignored them because it assumed they were in the old table block.
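To make the window concrete, here is a minimal sketch of the two skip rules. It is not the actual recovery code: `walWrite`, `replay`, the transaction numbers, and the choice of a strict `<` comparison are all assumptions made for illustration.

```go
package main

import "fmt"

// walWrite is a hypothetical, simplified WAL record: a write tagged with the
// transaction at which it was committed.
type walWrite struct {
	Tx   uint64
	Data string
}

// replay returns the writes that recovery would re-apply, skipping every
// write whose transaction is below cutoff. The strict "<" is an assumption.
func replay(wal []walWrite, cutoff uint64) []walWrite {
	var kept []walWrite
	for _, w := range wal {
		if w.Tx < cutoff {
			continue // assumed to already live in a persisted block
		}
		kept = append(kept, w)
	}
	return kept
}

func main() {
	// Assumed timeline: rotation swaps in the new block at tx 5, writes 6 and
	// 7 land in the new block, and the old block is only persisted at tx 8.
	const newBlockTx, persistTx uint64 = 5, 8
	wal := []walWrite{
		{Tx: 3, Data: "a"}, {Tx: 4, Data: "b"}, // old block
		{Tx: 6, Data: "c"}, {Tx: 7, Data: "d"}, // new block, before persist
		{Tx: 9, Data: "e"}, // new block, after persist
	}

	// Cutting off at the persist txn drops writes 6 and 7.
	fmt.Println("persist-txn cutoff keeps:", len(replay(wal, persistTx))) // 1
	// Cutting off at the new block's txn keeps them.
	fmt.Println("new-block-txn cutoff keeps:", len(replay(wal, newBlockTx))) // 3
}
```

With the persist transaction as the cutoff, writes 6 and 7 are skipped even though they were never part of the persisted block. The new block's transaction as the cutoff keeps them.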

@asubiotto
Member Author

There is a data race in the snapshot-on-recovery test. Will fix.

Contributor

thorfour left a comment


asubiotto force-pushed the alfonso-drotation branch 3 times, most recently from 3465731 to 8247efd on May 14, 2024 11:41
@asubiotto
Member Author

The race should be fixed. I ended up storing the next non-persisted txn in the WAL proto. The alternative was to slightly modify the replay semantics and add code to deduce this txn, so I decided it was easier to store and retrieve it with minimal changes to the recovery code.
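A rough sketch of the "store it rather than deduce it" idea, for readers skimming the thread. The record layout here is hypothetical: `record`, `kindBlockPersisted`, and the `NextTx` field are illustrative names, not the actual WAL proto definition, and the two-pass replay is a simplification.

```go
package main

import "fmt"

type recordKind int

const (
	kindWrite recordKind = iota
	kindBlockPersisted
)

// record is a hypothetical stand-in for a WAL entry. NextTx is only
// meaningful for kindBlockPersisted: it holds the first txn that is NOT
// covered by the persisted block.
type record struct {
	Kind   recordKind
	Tx     uint64
	NextTx uint64
}

// replayCutoff returns the NextTx stored in the last block-persisted record,
// or 0 if no block was persisted. Nothing has to be deduced from the order
// of rotation and persistence events.
func replayCutoff(wal []record) uint64 {
	var cutoff uint64
	for _, r := range wal {
		if r.Kind == kindBlockPersisted {
			cutoff = r.NextTx
		}
	}
	return cutoff
}

// writesToReplay returns the txns of all writes at or above the cutoff.
func writesToReplay(wal []record) []uint64 {
	cutoff := replayCutoff(wal)
	var txs []uint64
	for _, r := range wal {
		if r.Kind == kindWrite && r.Tx >= cutoff {
			txs = append(txs, r.Tx)
		}
	}
	return txs
}

func main() {
	wal := []record{
		{Kind: kindWrite, Tx: 4},
		{Kind: kindWrite, Tx: 6},
		{Kind: kindWrite, Tx: 7},
		// Persisted at tx 8, but the new block started at tx 5.
		{Kind: kindBlockPersisted, Tx: 8, NextTx: 5},
		{Kind: kindWrite, Tx: 9},
	}
	fmt.Println(writesToReplay(wal)) // [6 7 9]
}
```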

I also added a second commit with a fix for duplicate writes found by DST due to an off-by-one WAL error.

Previously, the recovery code was ignoring writes below the persist transaction
of a table block. However, this could lead to dropped writes since the active
block is swapped atomically on rotation *before* the old block is persisted.
Writes in between these two events would be written to the new table block, but
ignored on recovery given the recovery code assumed they were in the old table
block.

Previously, snapshots were performed using a write txn, and the WAL was
truncated so that this txn would be the first txn in the truncated WAL.
However, snapshots were changed to use a read txn, so if a write at txn k was
included in the snapshot, the truncated WAL would still contain this write as
its first entry, resulting in duplicate data after recovery.
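
The off-by-one can be illustrated with a small sketch. It assumes recovery loads the snapshot and then replays every remaining WAL entry, and that truncating at k+1 is the shape of the fix. `truncateWAL`, `recoverState`, and `hasDuplicate` are hypothetical helpers, not functions from the codebase.

```go
package main

import "fmt"

// truncateWAL keeps only the entries whose txn is >= first.
func truncateWAL(wal []uint64, first uint64) []uint64 {
	var out []uint64
	for _, tx := range wal {
		if tx >= first {
			out = append(out, tx)
		}
	}
	return out
}

// recoverState models recovery as "load snapshot, then replay the WAL".
func recoverState(snapshot, wal []uint64) []uint64 {
	out := append([]uint64{}, snapshot...) // copy to avoid aliasing snapshot
	return append(out, wal...)
}

// hasDuplicate reports whether any txn appears more than once.
func hasDuplicate(txs []uint64) bool {
	seen := make(map[uint64]bool, len(txs))
	for _, tx := range txs {
		if seen[tx] {
			return true
		}
		seen[tx] = true
	}
	return false
}

func main() {
	const k uint64 = 3
	snapshot := []uint64{1, 2, 3} // read txn k: the write at k is included
	wal := []uint64{1, 2, 3, 4, 5}

	// Truncating so that txn k is the first entry replays write k again.
	atK := recoverState(snapshot, truncateWAL(wal, k))
	fmt.Println(atK, hasDuplicate(atK)) // [1 2 3 3 4 5] true

	// Truncating at k+1 leaves only writes the snapshot does not contain.
	afterK := recoverState(snapshot, truncateWAL(wal, k+1))
	fmt.Println(afterK, hasDuplicate(afterK)) // [1 2 3 4 5] false
}
```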
asubiotto merged commit 2c5b58b into main on May 17, 2024
9 checks passed
asubiotto deleted the alfonso-drotation branch on May 17, 2024 09:47