Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transaction_async is not cancel-safe or panic-safe, and bedlam ensues #47

Open
davepacheco opened this issue Jun 22, 2023 · 2 comments · May be fixed by #81
Open

transaction_async is not cancel-safe or panic-safe, and bedlam ensues #47

davepacheco opened this issue Jun 22, 2023 · 2 comments · May be fixed by #81
Assignees
Milestone

Comments

@davepacheco
Copy link
Collaborator

I think transaction_async was ported to async from similar synchronous code. But it looks to me like it's not cancel-safe because if it is cancelled at this await point then we will wind up having executed BEGIN TRANSACTION; and then putting the connection back in the pool. The result of this is surprisingly terrible: any future operation (using transaction_async() or not) may wind up getting this connection, and it may wind up successfully executing any number of SQL statements (including UPDATEs and SELECTs) that appear to work normally, but none of the changes are reflected in the database because they're being made in a not-yet-committed transaction. This can go on indefinitely -- any number of statements over any amount of time. The connection is essentially "poisoned" in a way that updates are going to /dev/null except that reads (on this one connection only) do reflect those changes.

I have reproduced this with a (relatively) simple demo.

@davepacheco
Copy link
Collaborator Author

Note that this is in practice much less likely to be a problem for Omicron after oxidecomputer/dropshot#701.

@davepacheco davepacheco added this to the MVP milestone Jun 22, 2023
@davepacheco davepacheco changed the title transaction_async is not cancel-safe, and bedlam ensues transaction_async is not cancel-safe or panic-safe, and bedlam ensues Oct 23, 2024
@davepacheco
Copy link
Collaborator Author

It appears that @jmpesp saw a similar failure mode where the function passed to transaction_async panicked (rather than being cancelled). The result is that the connection was returned to qorb still inside a transaction. This is slightly different than being cancelled, but the effect is the same (a connection escapes without being reset to its nominal state) and a possible fix could be the same (having a Drop handler deal with this).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants