-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
block writeable account data limit #184
base: main
Are you sure you want to change the base?
block writeable account data limit #184
Conversation
@bw-solana do we know the current impact if this change was live the past 30 days? |
Good question, I don't know. I'll get some data |
transaction being handled by the block producer (for inclusion in a block) or | ||
validator (replaying transactions in a block), add it to a running total of | ||
total writeable account data for the block, and compare it against some | ||
threshold (recommend 2GB to start) to decide if the transaction is allowed to be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have statistics on what is the current distribution of writable data on the mainnet?
I wonder if we could start with an even lower value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have perfect stats, but I posted comment with some analysis below (#184 (comment))
2GiB feels right as a starting point. It should constrain worst case reasonably well without impacting current typical behavior on mainnet. We could ratchet things down (or up) after observing behavior
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000` | ||
(2GB). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2GB is 2_147_483_648
:P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000` | |
(2GB). | |
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to 2GiB | |
(`2 * 1024 * 1024 * 1024`). |
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as | ||
retryable in execute_and_commit_transactions_locked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as | |
retryable in execute_and_commit_transactions_locked | |
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as | |
retryable in `execute_and_commit_transactions_locked()`. |
There is an edge case where if a single transaction tries to consume more than | ||
the entire per block writeable account data budget, it should not be marked as | ||
retryable. This might not be possible today given tx size limits and maximum | ||
account size limits, but this could be exploitable in the future if either of | ||
this constraints change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to introduce a separate constraint that limits the amount of writable data a single transaction can access?
We have MAX_LOADED_ACCOUNTS_DATA_SIZE_BYTES
equal to 64MB which limits the total amount of data a transaction can load.
We could add MAX_LOADED_ACCOUNTS_WRITEABLE_DATA_SIZE_BYTES
(with an obvious semantics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could. We'll effectively be constrained to the 64MiB per tx for updated account data and 100MiB per block for newly allocated account data, so we already have some guard rails here, but we could be more explicit
1. Update the cost model to compute the aggregate size of all accounts marked | ||
writeable for each transaction. This can be done similarly to how | ||
`data_bytes_len_total` is computed inside `get_transaction_cost` but limited | ||
to accounts marked writeable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is inaccurate.
Looking at the get_transaction_cost()
in cost-model
, I think, data_bytes_len_total
only calculates the size of the instruction data passed into instructions.
It does not look at the account sizes:
data_bytes_len_total =
data_bytes_len_total.saturating_add(instruction.data.len() as u64);
The loaded accounts data size is only included in the transaction cost after the transaction execution.
Actual value is passed into the calculate_cost_for_executed_transaction()
as the actual_loaded_accounts_data_size_bytes
argument.
And it is populated and checked in the blockstore_processor.rs
:
Some(CostModel::calculate_cost_for_executed_transaction(
tx,
committed_tx.executed_units,
committed_tx.loaded_account_stats.loaded_accounts_data_size,
&bank.feature_set,
))
We could record have a similar counter for the amount of writable data that the transaction touched.
So at the high level it can be computed similarly to what we are already doing.
It is just that the details would be different.
Not sure if there are any implications for the DX, due to the fact that we can only compute this value after a transaction execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're right. We don't have any precedent for understanding actual account data before loading/executing transactions. We rely on either scraping top level instructions (e.g. for allocate) or estimating and then adjusting.
This would seemingly have to fall into the estimate + adjust case (unless we want to do something radical like grab account data size before execution), which would likely impact DX if we push this onto the user similar to how we ask for CU/Data budget estimates.
One option is to leverage the existing Data budget value for writes as well (defaults to 2MiB per tx or can be set by user). I think this is likely small enough of a default to not bottleneck scheduling before the data budget gets refunded post execution. We would need to study this.
or if the slot needs to be marked dead (if sum is above the limit for the case | ||
of validator). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
andrew will have to look at this because if it can invalidate blocks then it may interfere with asynchronous execution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout. I would naively expect this to be handled in the same manner that we are handling account data load limits, but someone with a bigger brain should confirm the details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its unfortunately quite different, transaction data size (sum of all account data sizes required for a transaction) can be calculated on the transaction level, and if the transaction exceeds its limit, it is treated as invalid. so in a world with async execution, the transaction could be included in a block before verifying it, but all nodes will see that it violates its limits and treat it the same (charge fees and not execute it, once the feature gate for fee-only transactions is active)
but if we added a data size limit at the block level, you would need to know the running total of all previous transactions to determine whether you can put the transaction in the block. theres no way to do this without executing because accounts can be created, deallocated, or reallocated. the purpose of removing constraints to enable async execution is that block producers which put "bad" transactions in a block might make suboptimal blocks, but they can always be sure the block is valid if they follow constraints that dont depend on execution (are transactions syntactically valid, are blockhashes valid, etc)
i believe we could have a block-level constraint as proposed here if instead of "if you exceed the block data size limit, the block is invalid" it was like "if you exceed the block data size limit, no further transactions in the block will be executed." this could be user-hostile tho, because it lets block producers invalidate transactions that should otherwise succeed. especially if we charge fees, the block producer now has an incentive to exceed the block limit and farm fees without having to execute transactions. this is different from a transaction exceeding its own data size because in that case the transaction is invalid as constructed on its own
alternatively we could very easily charge more for writable data at the transaction level, but im not sure that accomplishes the goals of this simd. andrew may have better suggestions than me tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"if you exceed the block data size limit, no further transactions in the block will be executed."
I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation!
IIUC, the key difference between the strategy for loaded account data vs written account data is in how we know/estimate the amounts for a given tx.
We can know how much data a tx loads by either: (1) inspecting the tx and loading the accounts or (2) using the data budget requested / default data budget. The latter obviously is prone to under-packing, but with the current rates (48M CUs per block, default data budget of 2MiB, charging 8 CUs per 32kB), we could still pack >90k txs per block using default data budget.
For understanding the amount of data written, we either (1) actually execute the tx or (2) rely on some data written budget requested. Method 1 is a showstopper for async execution, so it seems like 2 is our only option. A default budget of 2MiB would allow only 1k default budget txs with a 2GiB write limit. Maybe we should do some profiling to see how low we could reasonably push this limit without impacting most existing txs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, think we are on the same page! the way transaction data size works now is theres a default budget (64MiB), adjustable by compute budget instruction, and at execution time the accounts are totalled and the transaction is invalid if it exceeds its limit. but this doesnt affect the block, so it doesnt interfere with async execution
we could add a transaction-level writable accounts budget with a smaller default limit (assume 2MiB like you suggest), and it would likewise be async-safe. we could easily have block-level constraints based on those requested budgets, which would also be async-safe. the main risk is people requesting enormous budgets and crowding other transactions out of the block, which we could defray with either a per-transaction maximum limit** or by exponentially ramping the cost per byte
if you make changes to the transaction data size calculation code, note that the formula is quite tricky. i have an open simd to fix this #186
**(16-20MiB would allow writing to a 10Mib account plus plenty of breathing room, but i dont actually know how big the largest accounts are onchain. 10MiB is the max allocation size but i believe you can pump accounts larger than that with realloc)
TL;DR - I think we would see essentially zero impact to mainnet behavior with a 2GiB write limit per block. We don't have existing metrics for writeable account data per slot, but I was able to collect some relevant data that should help this conversation. Data is looking back at metrics over the past 2+ weeks. We track the amount of account data stored per second ( I am observing writes typically in the 100MiB range with spikes as high as ~1.9GiB. We'll see spikes over 1GiB ~a few times per day. We also track newly allocated account data per block ( I also added some metrics to track the largest append vec writes. Most of these are in the 1-2MiB range, but we see spikes up to 400MiB on epoch boundary due to saving stake accounts. These spikes will come down significantly and be spread out when PER is enabled. I've also observed non-epoch-boundary spikes up to the ~20MiB range. @brooksprumo - It would be helpful if you or someone could fact check me here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having block-wide limits. I attempted something similar a while ago, when (iirc) we did not, and thus had to hijack transaction errors to indicate block-wide issues. That model didn't work, and I wrote about it here: solana-labs/solana#27029. I've heard the TVU is smarter now, and can accommodate block-wide limit, which is great!
or if the slot needs to be marked dead (if sum is above the limit for the case | ||
of validator). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"if you exceed the block data size limit, no further transactions in the block will be executed."
I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029
block being produced + the amount of account data that would be written with | ||
the current transaction against the total block limit in the would_fit | ||
function | ||
3. Store the write limit in a constant such as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Store the write limit in a constant
From past experience, I'd recommend against a constant that we intend to change later (or minimally want to support allowing to be changed). IMO a function that takes in the feature set and returns the value is more future-proof.
But, this is pretty agave-implementation specific, so may not be germane to a SIMD.
I worked on adding a metric for the amount of data stored by transaction processing, but it was not merged: solana-labs/solana#34718. Maybe we can bring this back, or use it to gather data more specific to transaction processing.
I don't have these numbers off hand; I'll look them up. |
No description provided.