wal: cyclic redundancy check(CRC) enhanced write ahead log(WAL) #543
mattisonchao started this conversation in Proposal
Background Knowledge
Motivation
In modern distributed databases, the integrity and durability of data are paramount. Systems handling critical operations require guarantees that data will persist even during system failures. Write-Ahead Logs (WALs) have emerged as a vital technique for ensuring these guarantees, providing a mechanism to log changes before they are applied to the system's primary data store.
However, despite their reliability, WALs can still be vulnerable to data corruption due to hardware failures, incomplete writes, or unexpected system crashes. These issues may result in subtle, undetected data corruption that undermines the integrity of the log and the overall system.
A CRC-enhanced WAL addresses these challenges by incorporating a lightweight and efficient mechanism to ensure log integrity. Cyclic Redundancy Check (CRC) is a widely used algorithm to detect accidental changes to raw data. Adding a CRC checksum to each log entry in the WAL can dramatically improve the system's ability to detect data corruption and ensure safe recovery after crashes.
Goals
In Scope
Design
Background
The image above illustrates the structure of the Oxia server's Write-Ahead Log (WAL), which comprises segments.
Every segment has two files for different purposes.
The current record format is pretty straightforward.
The index file stores its entries back to back with no extra framing. Each entry is 4 bytes wide and holds the txn file offset of the corresponding logical record.
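As a rough illustration of the 4-byte index entries described above, a lookup could be sketched as follows. The byte order (big-endian) and the helper names `txnOffset` and `appendOffset` are assumptions made for this example; they are not taken from the Oxia code base.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// entrySize is the width of one index entry, as described in the proposal.
const entrySize = 4

// txnOffset returns the txn file offset of the record at position rel inside
// a segment, where rel = recordIndex - segmentBaseIndex.
// Assumption: entries are big-endian uint32 values.
func txnOffset(idx []byte, rel int) (uint32, error) {
	start := rel * entrySize
	if rel < 0 || start+entrySize > len(idx) {
		return 0, fmt.Errorf("record %d is outside the index (%d bytes)", rel, len(idx))
	}
	return binary.BigEndian.Uint32(idx[start : start+entrySize]), nil
}

// appendOffset appends one entry to an in-memory index (hypothetical helper).
func appendOffset(idx []byte, offset uint32) []byte {
	return binary.BigEndian.AppendUint32(idx, offset)
}
```

Because every entry has the same width, finding a record's offset is a single multiplication and a 4-byte read, with no scan of the txn file.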
Implementation
The new version of the write-ahead log will support cyclic redundancy checks (CRC) for both the txn and idx formats. The segment concept and segment organisation stay unchanged, so we can reuse most of the existing logic.
Chained cyclic redundancy check(CRC)
This proposal uses a chained cyclic redundancy check (CRC), which helps quickly validate that the same write-ahead log is present across nodes. For example, we can use the CRC to check whether an Oxia shard follower's whole WAL is exactly the same as the leader's, and we can take corrective action when there is an inconsistency.
The image above illustrates the structure of the chained CRC concept.
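A minimal sketch of the chaining idea, assuming each record's CRC is computed by feeding the previous record's CRC in as the starting value. The CRC-32C (Castagnoli) polynomial, the zero seed, and the function names are assumptions for this example; the proposal does not fix them.

```go
package main

import "hash/crc32"

// Assumption: CRC-32C (Castagnoli). The proposal does not name the polynomial.
var table = crc32.MakeTable(crc32.Castagnoli)

// nextCRC folds one record payload into the running checksum.
func nextCRC(prev uint32, payload []byte) uint32 {
	return crc32.Update(prev, table, payload)
}

// chainCRC returns the chained checksum after every record, starting from seed.
func chainCRC(seed uint32, payloads [][]byte) []uint32 {
	out := make([]uint32, 0, len(payloads))
	crc := seed
	for _, p := range payloads {
		crc = nextCRC(crc, p)
		out = append(out, crc)
	}
	return out
}

// sameLog compares two logs by record count and by their last chained CRC only.
func sameLog(leader, follower [][]byte) bool {
	l, f := chainCRC(0, leader), chainCRC(0, follower)
	if len(l) != len(f) {
		return false
	}
	if len(l) == 0 {
		return true
	}
	return l[len(l)-1] == f[len(f)-1]
}
```

Since each checksum depends on every record before it, two nodes can compare a single 4-byte value at the same record index instead of exchanging the whole log. In a real deployment the nodes would exchange the stored CRC rather than recompute the chain as `sameLog` does here.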
Txn format support
To support the chained CRC, we need to add a CRC field that stores the record's checksum for validation. We also need to add a Previous CRC field for faster validation and to remove the hard dependency on the previous record: a record can then be validated even if the previous record was trimmed or corrupted.
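The record layout could look roughly like the sketch below. The field order (payload size, Previous CRC, CRC, payload), the 4-byte big-endian widths, and the CRC-32C polynomial are assumptions for illustration only; the actual layout is defined by the diagram in the proposal.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

// Assumed header: payloadSize | previousCRC | CRC, 4 bytes each.
const headerSize = 12

var table = crc32.MakeTable(crc32.Castagnoli) // assumed polynomial

var errCorrupted = errors.New("record corrupted")

// encodeRecord appends one record to buf and returns the buffer and the record's CRC.
func encodeRecord(buf []byte, prevCRC uint32, payload []byte) ([]byte, uint32) {
	crc := crc32.Update(prevCRC, table, payload)
	buf = binary.BigEndian.AppendUint32(buf, uint32(len(payload)))
	buf = binary.BigEndian.AppendUint32(buf, prevCRC)
	buf = binary.BigEndian.AppendUint32(buf, crc)
	return append(buf, payload...), crc
}

// decodeRecord validates the record at offset using only its own header, so
// the previous record does not have to be readable.
// It returns the payload, the record's CRC, and the offset of the next record.
func decodeRecord(buf []byte, offset int) ([]byte, uint32, int, error) {
	if offset < 0 || len(buf)-offset < headerSize {
		return nil, 0, 0, errCorrupted
	}
	size := binary.BigEndian.Uint32(buf[offset:])
	prevCRC := binary.BigEndian.Uint32(buf[offset+4:])
	crc := binary.BigEndian.Uint32(buf[offset+8:])
	if uint64(size) > uint64(len(buf)-offset-headerSize) {
		return nil, 0, 0, errCorrupted // payloadSize is out of bounds
	}
	start := offset + headerSize
	payload := buf[start : start+int(size)]
	if crc32.Update(prevCRC, table, payload) != crc {
		return nil, 0, 0, errCorrupted // checksum mismatch
	}
	return payload, crc, start + len(payload), nil
}
```

Storing the Previous CRC inside the record is what lets `decodeRecord` check a record in isolation. A reader that does have the previous record can additionally compare that record's CRC with this record's Previous CRC to verify the chain.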
Idx format support
We also need to add a CRC to the index file so that we can check whether the current index is valid.
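One possible shape for this, sketched under the assumption that the checksum is stored as a 4-byte trailer covering all index entries. The trailer position, byte order, polynomial, and function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"hash/crc32"
)

var table = crc32.MakeTable(crc32.Castagnoli) // assumed polynomial

// sealIndex serializes the offsets as 4-byte entries and appends a 4-byte
// CRC trailer computed over those entries (assumed placement).
func sealIndex(offsets []uint32) []byte {
	buf := make([]byte, 0, 4*len(offsets)+4)
	for _, off := range offsets {
		buf = binary.BigEndian.AppendUint32(buf, off)
	}
	return binary.BigEndian.AppendUint32(buf, crc32.Checksum(buf, table))
}

// indexIsValid recomputes the checksum over the entries and compares it
// with the stored trailer.
func indexIsValid(idx []byte) bool {
	if len(idx) < 4 || len(idx)%4 != 0 {
		return false
	}
	body, trailer := idx[:len(idx)-4], idx[len(idx)-4:]
	return crc32.Checksum(body, table) == binary.BigEndian.Uint32(trailer)
}
```

A single trailer keeps the per-entry width at 4 bytes, so lookups by position work exactly as before; the cost is that the whole index has to be read once to validate it.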
Corruption Handling
Errors
Currently, if we encounter data corruption, there are three errors we might get when we read data.
payloadSize is out of file bounds.
payloadSize [...], which is unacceptable for WAL.
Idx corrupted
If the idx is corrupted, we can discard it and rebuild the index through the txn file.
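Rebuilding the index from the txn file amounts to walking the records and noting where each one starts. The sketch below assumes a 12-byte record header whose first 4 bytes are the big-endian payload size; that layout and the helper names are assumptions for this example.

```go
package main

import "encoding/binary"

// Assumed record header size: payloadSize | previousCRC | CRC, 4 bytes each.
const headerSize = 12

// rebuildOffsets walks the txn bytes and returns the start offset of every
// complete record. It stops at the first record that does not fit.
func rebuildOffsets(txn []byte) []uint32 {
	var offsets []uint32
	pos := 0
	for pos+headerSize <= len(txn) {
		size := binary.BigEndian.Uint32(txn[pos:])
		if uint64(size) > uint64(len(txn)-pos-headerSize) {
			break // truncated or corrupted tail
		}
		offsets = append(offsets, uint32(pos))
		pos += headerSize + int(size)
	}
	return offsets
}

// appendRecord writes a record with zeroed CRC fields. It is a hypothetical
// helper that exists only to build input for this sketch.
func appendRecord(txn []byte, payload []byte) []byte {
	txn = binary.BigEndian.AppendUint32(txn, uint32(len(payload)))
	txn = append(txn, make([]byte, headerSize-4)...)
	return append(txn, payload...)
}
```

The returned offsets are exactly the values the idx file stores, so a discarded index can be regenerated without losing any information. A fuller version would also verify each record's CRC while scanning.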
Txn corrupted
uncommitted data corrupted
If the corrupted WAL data is uncommitted, we can also discard the affected records, because we give no guarantees for uncommitted data.
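The decision for the uncommitted case can be sketched as a comparison between the first corrupted record and the commit offset. The function name and the choice to return an error for committed data are assumptions; the proposal leaves the committed case open.

```go
package main

import "errors"

// errCommittedCorrupted marks the case the proposal has not specified yet.
var errCommittedCorrupted = errors.New("committed record corrupted")

// lastRecordToKeep returns the index of the last record to keep when the
// record at firstCorrupted fails validation.
// Records after commitOffset are uncommitted and may be dropped.
func lastRecordToKeep(firstCorrupted, commitOffset int64) (int64, error) {
	if firstCorrupted <= commitOffset {
		// Placeholder behaviour: surface the problem to the caller.
		return 0, errCommittedCorrupted
	}
	return firstCorrupted - 1, nil
}
```

Everything from the first corrupted record onward is dropped, not only the damaged record, because later records can no longer be tied to a valid chain.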
committed data corrupted
todo...
Backward & Forward Compatibility
We use the file extension to support compatibility. Every version of the codec should have its own txn and idx file extensions. For example:
v1 codec: .txn, .idx
v2 codec: .txnx, .idxx
The codec should be segment-level. The server will iterate over all versions of the file extension to find the correct one. The default is the latest version.