Skip to content

Commit

Permalink
fix(tee): correct previous fix for race condition in batch locking
Browse files Browse the repository at this point in the history
Commit a7dc0ed (PR #3342) was supposed
to fix a race condition in batch locking by introducing SQL row-locking,
but it didn't work as expected. Now we are switching back to
coarser-grained table-level locking as [originally suggested][1] by
Harald. The original fix was hard to test unless deployed to `stage` due
to the undeterministic nature of the problem, so we needed to merge it
to the `main` branch to properly test it.

[1]: #3342 (comment)
  • Loading branch information
pbeza committed Dec 4, 2024
1 parent a7dc0ed commit a0b49cb
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 8 deletions.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 10 additions & 6 deletions core/lib/dal/src/tee_proof_generation_dal.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,10 +66,16 @@ impl TeeProofGenerationDal<'_, '_> {
let min_batch_number = i64::from(min_batch_number.0);
let mut transaction = self.storage.start_transaction().await?;

// Lock rows in the proof_generation_details table to prevent race conditions. The
// tee_proof_generation_details table does not have corresponding entries yet if this is the
// first time the query is invoked for a batch. Locking rows in proof_generation_details
// ensures that two different TEE prover instances will not try to prove the same batch.
// Lock the entire tee_proof_generation_details table in EXCLUSIVE mode to prevent race
// conditions. Locking the table ensures that two different TEE prover instances will not
// try to prove the same batch.
sqlx::query("LOCK TABLE tee_proof_generation_details IN EXCLUSIVE MODE")
.instrument("lock_batch_for_proving#lock_table")
.execute(&mut transaction)
.await?;

// The tee_proof_generation_details table does not have corresponding entries yet if this is
// the first time the query is invoked for a batch.
let batch_number = sqlx::query!(
r#"
SELECT
Expand All @@ -95,8 +101,6 @@ impl TeeProofGenerationDal<'_, '_> {
)
)
LIMIT 1
FOR UPDATE OF p
SKIP LOCKED
"#,
tee_type.to_string(),
TeeProofGenerationJobStatus::PickedByProver.to_string(),
Expand Down

0 comments on commit a0b49cb

Please sign in to comment.