Allowing entries containing transactions with conflicting accounts in Banking stage #33899
Comments
Thanks for posting the issue @talalim; it's much less likely to get lost here. This week is a little busy with Breakpoint going on, but I will try to refresh myself on the matter and give an answer next week.
Thanks @steviez
@talalim, thanks for creating an issue. This is something that has been on the map for a while, but it has been held back by other work. With respect to the actual implementation, I don't think it's as simple as described here.
We currently re-batch the entries we are processing. If we removed the re-batching and used entries directly, we could guarantee that the entries we process are potentially self-conflicting but do not conflict with each other, so there are no issues with parallel access. Beyond that, it is a matter of re-working the current implementation. The current batch processing of transactions does not simply loop over transactions; it performs each section of work for all transactions before moving on to the next. This requires re-writing some code, because we can no longer load all our accounts up front: for a shared writable account, the state may have changed by the time a later transaction runs. None of that is to say this shouldn't be done; it should. I'm just giving an overview of how I've viewed the problem and why it hasn't been done yet.
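To make the "load all accounts up front" problem concrete, here is a minimal sketch in plain Rust. Everything in it (`Account`, `Balances`, `Transfer`, `execute_staged`, `execute_sequential`) is a hypothetical stand-in invented for illustration, not the actual validator code. It only shows why staged processing gives wrong results once two transactions in the same batch write the same account.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; these are NOT the real Solana types.
type Account = &'static str;
type Balances = HashMap<Account, u64>;

struct Transfer {
    from: Account,
    to: Account,
    amount: u64,
}

fn bal(balances: &Balances, account: Account) -> u64 {
    balances.get(account).copied().unwrap_or(0)
}

/// Staged processing: load every account for every transaction first,
/// then execute each transaction against the snapshot it loaded.
fn execute_staged(bank: &Balances, txs: &[Transfer]) -> Balances {
    // Stage 1: load all accounts up front.
    let loaded: Vec<(u64, u64)> = txs
        .iter()
        .map(|t| (bal(bank, t.from), bal(bank, t.to)))
        .collect();
    // Stage 2: execute and store, using the (possibly stale) snapshots.
    let mut out = bank.clone();
    for (t, (from_bal, to_bal)) in txs.iter().zip(loaded) {
        if from_bal >= t.amount {
            out.insert(t.from, from_bal - t.amount);
            out.insert(t.to, to_bal + t.amount);
        }
    }
    out
}

/// Sequential processing: each transaction loads its accounts only after
/// the previous transaction's writes have been applied.
fn execute_sequential(bank: &Balances, txs: &[Transfer]) -> Balances {
    let mut out = bank.clone();
    for t in txs {
        let from_bal = bal(&out, t.from);
        if from_bal >= t.amount {
            out.insert(t.from, from_bal - t.amount);
            let to_bal = bal(&out, t.to);
            out.insert(t.to, to_bal + t.amount);
        }
    }
    out
}
```

With two transfers that both spend the full balance of the same source account, the staged version lets both succeed because the second one sees a stale balance, while the sequential version rejects the second.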
Created a SIMD here: solana-foundation/solana-improvement-documents#83
Hey @apfitzge
We can open a PR with these changes so that the community can review them.
Please add me (huzaifa.rehman#2212) and talal.irfan#3032 to proj-tx-scheduler, as we are actively working on this problem and have obtained some exciting results.
TVU handles two distinct scenarios in process_entries() (Link):

Flow 1: Re-batching Conditions

After either of the above-mentioned paths, each batch is allocated to an individual Rayon thread [link]. Every transaction within an entry is then executed sequentially within its assigned Rayon thread.

The challenge arises when self-conflicting batches occur within the first flow. Breaking them into smaller batches shuffles the order and prevents their parallel execution within Rayon threads.

Another challenge for entries with conflicting transactions arises during execution, because account state is not carried over and updated from one transaction to the next. This needs to be done in temporary state for the duration of an entry's execution, so that balances are maintained correctly until the entry is committed.

To address this, the PR [link] introduces some straightforward changes to the rebatching logic and to the account state logic during execution. These modifications aim to allow the parallel execution of self-conflicting batches within Rayon threads.

I have tested this with a multi-node test by enabling batches containing conflicting transactions on the TPU side and setting up bench_tps to create conflicting/interdependent transactions.
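The idea of carrying account state from transaction to transaction in temporary state until the entry is committed can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `Pubkey` as a `u8`, `EntryState`, `transfer`, `into_writes`, and `commit` are all hypothetical names, and balances stand in for full account data.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for an account address; not the real Pubkey type.
type Pubkey = u8;

/// Temporary per-entry state: reads fall through to the committed bank
/// state, writes stay in `pending` until the entry is committed.
struct EntryState<'a> {
    base: &'a HashMap<Pubkey, u64>,
    pending: HashMap<Pubkey, u64>,
}

impl<'a> EntryState<'a> {
    fn new(base: &'a HashMap<Pubkey, u64>) -> Self {
        EntryState { base, pending: HashMap::new() }
    }

    /// Latest balance as seen by the next transaction in this entry.
    fn balance(&self, key: Pubkey) -> u64 {
        match self.pending.get(&key) {
            Some(v) => *v,
            None => self.base.get(&key).copied().unwrap_or(0),
        }
    }

    /// Execute one transfer against the temporary state.
    /// Returns false (and changes nothing) if funds are insufficient.
    fn transfer(&mut self, from: Pubkey, to: Pubkey, amount: u64) -> bool {
        let from_bal = self.balance(from);
        if from_bal < amount {
            return false;
        }
        self.pending.insert(from, from_bal - amount);
        let to_bal = self.balance(to);
        self.pending.insert(to, to_bal + amount);
        true
    }

    /// Hand back the accumulated writes once the whole entry has executed.
    fn into_writes(self) -> HashMap<Pubkey, u64> {
        self.pending
    }
}

/// Apply an entry's writes to the committed state in one step.
fn commit(bank: &mut HashMap<Pubkey, u64>, writes: HashMap<Pubkey, u64>) {
    bank.extend(writes);
}
```

In this sketch a second transaction that spends funds received by the first one in the same entry succeeds, and the committed state stays untouched until `commit` is called.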
@Huzaifa696 - Thanks for your continued interest / efforts on this problem! As mentioned in this comment: The order we'll follow is
Chatting with Andrew, he pointed out that there could be some changes made that don't necessarily impact consensus / require a feature gate. But if we're going to potentially update this piece, I think it makes sense to reach agreement first and then plan the agreed-upon changes accordingly.
Problem
TPU has to make entries using only “disjoint” transactions because TVU makes some overly strict assumptions, i.e:
Proposed Solution
We've found that entries can be allowed to contain transactions with conflicting accounts, which makes transaction scheduling more flexible, if the locking function is changed to permit such conflicts within an entry. This doesn't break the execution of transactions on the replay side, because replay parallelizes entries for execution, not individual transactions within an entry.
Judging by the way locking is implemented in execution, the assumption appears to be that the replay side executes individual transactions in parallel, which doesn't seem to be the case.
From our exploration, the only thing that can break is the rebatching logic in TVU, which can also be fixed with a small change, e.g. acquiring locks after rebatching is done.
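As a rough illustration of entry-level locking, the sketch below treats an entry's lock set as the union of its transactions' accounts, so conflicts inside an entry are allowed and only conflicts between entries end a parallel group. It is a simplified model with hypothetical names (`Tx`, `Entry`, `LockSet`, `group_entries`), not the actual locking function or the rebatching code in TVU.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-ins; not the real Solana types.
type Account = &'static str;

struct Tx {
    writable: Vec<Account>,
    readonly: Vec<Account>,
}

struct Entry {
    txs: Vec<Tx>,
}

#[derive(Default)]
struct LockSet {
    write: HashSet<Account>,
    read: HashSet<Account>,
}

impl LockSet {
    /// Union of all accounts touched by the entry. Transactions inside
    /// the entry may conflict with each other; that is allowed here.
    fn of_entry(entry: &Entry) -> LockSet {
        let mut set = LockSet::default();
        for tx in &entry.txs {
            set.write.extend(tx.writable.iter().copied());
            set.read.extend(tx.readonly.iter().copied());
        }
        set
    }

    /// Write/write and read/write overlaps count as conflicts.
    fn conflicts_with(&self, other: &LockSet) -> bool {
        self.write
            .iter()
            .any(|a| other.write.contains(a) || other.read.contains(a))
            || self.read.iter().any(|a| other.write.contains(a))
    }

    fn merge(&mut self, other: LockSet) {
        self.write.extend(other.write);
        self.read.extend(other.read);
    }
}

/// Group consecutive entries (by index) so that entries in the same group
/// do not conflict with each other and could run on separate threads,
/// while each entry's transactions run sequentially.
fn group_entries(entries: &[Entry]) -> Vec<Vec<usize>> {
    let mut groups = Vec::new();
    let mut current = Vec::new();
    let mut held = LockSet::default();
    for (i, entry) in entries.iter().enumerate() {
        let locks = LockSet::of_entry(entry);
        if held.conflicts_with(&locks) {
            // Flush the current group and start a new one.
            groups.push(std::mem::take(&mut current));
            held = LockSet::default();
        }
        held.merge(locks);
        current.push(i);
    }
    if !current.is_empty() {
        groups.push(current);
    }
    groups
}
```

Under this model, an entry whose two transactions both write the same account still forms a single unit of work, and it only forces a new group when a later entry touches that account.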
We think that if this overly strict requirement of disjoint entries is relaxed to allow entries containing transactions with conflicting accounts, the performance of the banking stage can be improved by scheduling transactions in a way that improves CPU utilization.
Is there anything else you can think of that would be affected by allowing entries with conflicting transactions?
@steviez @apfitzge @jstarry @buffalu