Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

block writeable account data limit #184

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
107 changes: 107 additions & 0 deletions proposals/0184-block-writeable-account-data-limit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
---
simd: '0184'
title: Block Writeable Account Data Limit
authors:
- Brennan Watt (Anza)
category: Standard
type: Core
status: Idea
created: 2024-10-15
feature: (fill in with feature tracking issues once accepted)
supersedes:
superseded-by:
extends:
---

## Summary

A protocol limit on the amount of account data in bytes that may be written
during a single block.

## Motivation

When direct mapping is enabled, clients will be able to handle much more account
data because there will be fewer copies in memory (e.g. for each VM invocation /
CPI call). However, writeable account data cannot be direct mapped because it
needs to be written back after a successful tx execution. This necessitates a
decoupling between account read and write data in order to increase readable
account data limits.

Note: There is an existing per block limit on newly allocated account data
(100MB), but this does not encompass account data being updated/written back.

## New Terminology

None

## Detailed Design

This feature would check the amount of account data marked as writeable for each
transaction being handled by the block producer (for inclusion in a block) or
validator (replaying transactions in a block), add it to a running total of
total writeable account data for the block, and compare it against some
threshold (recommend 2GB to start) to decide if the transaction is allowed to be

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have statistics on what is the current distribution of writable data on the mainnet?
I wonder if we could start with an even lower value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have perfect stats, but I posted comment with some analysis below (#184 (comment))

2GiB feels right as a starting point. It should constrain worst case reasonably well without impacting current typical behavior on mainnet. We could ratchet things down (or up) after observing behavior

included in the block (if sum is below the limit for the case of block producer)
or if the slot needs to be marked dead (if sum is above the limit for the case
of validator).
Comment on lines +45 to +46
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

andrew will have to look at this because if it can invalidate blocks then it may interfere with asynchronous execution

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. I would naively expect this to be handled in the same manner that we are handling account data load limits, but someone with a bigger brain should confirm the details

Copy link
Member

@2501babe 2501babe Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its unfortunately quite different, transaction data size (sum of all account data sizes required for a transaction) can be calculated on the transaction level, and if the transaction exceeds its limit, it is treated as invalid. so in a world with async execution, the transaction could be included in a block before verifying it, but all nodes will see that it violates its limits and treat it the same (charge fees and not execute it, once the feature gate for fee-only transactions is active)

but if we added a data size limit at the block level, you would need to know the running total of all previous transactions to determine whether you can put the transaction in the block. theres no way to do this without executing because accounts can be created, deallocated, or reallocated. the purpose of removing constraints to enable async execution is that block producers which put "bad" transactions in a block might make suboptimal blocks, but they can always be sure the block is valid if they follow constraints that dont depend on execution (are transactions syntactically valid, are blockhashes valid, etc)

i believe we could have a block-level constraint as proposed here if instead of "if you exceed the block data size limit, the block is invalid" it was like "if you exceed the block data size limit, no further transactions in the block will be executed." this could be user-hostile tho, because it lets block producers invalidate transactions that should otherwise succeed. especially if we charge fees, the block producer now has an incentive to exceed the block limit and farm fees without having to execute transactions. this is different from a transaction exceeding its own data size because in that case the transaction is invalid as constructed on its own

alternatively we could very easily charge more for writable data at the transaction level, but im not sure that accomplishes the goals of this simd. andrew may have better suggestions than me tho

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if you exceed the block data size limit, no further transactions in the block will be executed."

I like the direction of this, but I believe it leads to divergence, as transaction replay order is not guaranteed (iiuc). I tried to do something related a while ago, and wrote up my findings here: solana-labs/solana#27029

Copy link
Author

@bw-solana bw-solana Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation!

IIUC, the key difference between the strategy for loaded account data vs written account data is in how we know/estimate the amounts for a given tx.

We can know how much data a tx loads by either: (1) inspecting the tx and loading the accounts or (2) using the data budget requested / default data budget. The latter obviously is prone to under-packing, but with the current rates (48M CUs per block, default data budget of 2MiB, charging 8 CUs per 32kB), we could still pack >90k txs per block using default data budget.

For understanding the amount of data written, we either (1) actually execute the tx or (2) rely on some data written budget requested. Method 1 is a showstopper for async execution, so it seems like 2 is our only option. A default budget of 2MiB would allow only 1k default budget txs with a 2GiB write limit. Maybe we should do some profiling to see how low we could reasonably push this limit without impacting most existing txs.

Copy link
Member

@2501babe 2501babe Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, think we are on the same page! the way transaction data size works now is theres a default budget (64MiB), adjustable by compute budget instruction, and at execution time the accounts are totalled and the transaction is invalid if it exceeds its limit. but this doesnt affect the block, so it doesnt interfere with async execution

we could add a transaction-level writable accounts budget with a smaller default limit (assume 2MiB like you suggest), and it would likewise be async-safe. we could easily have block-level constraints based on those requested budgets, which would also be async-safe. the main risk is people requesting enormous budgets and crowding other transactions out of the block, which we could defray with either a per-transaction maximum limit** or by exponentially ramping the cost per byte

if you make changes to the transaction data size calculation code, note that the formula is quite tricky. i have an open simd to fix this #186

**(16-20MiB would allow writing to a 10Mib account plus plenty of breathing room, but i dont actually know how big the largest accounts are onchain. 10MiB is the max allocation size but i believe you can pump accounts larger than that with realloc)


This feature to limit writeable account data per block would slot in nicely to
the existing cost model and cost tracker that maintain per block limits for
various resources transactions can consume. This will allow block producers to
determine when a block is "full" and validators to mark slots dead that do not
adhere to these limits.

This feature could be implemented like so (using Agave client as reference):

1. Update the cost model to compute the aggregate size of all accounts marked
writeable for each transaction. This can be done similarly to how
`data_bytes_len_total` is computed inside `get_transaction_cost` but limited
to accounts marked writeable.
Comment on lines +56 to +59

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is inaccurate.

Looking at the get_transaction_cost() in cost-model, I think, data_bytes_len_total only calculates the size of the instruction data passed into instructions.
It does not look at the account sizes:

            data_bytes_len_total =
                data_bytes_len_total.saturating_add(instruction.data.len() as u64);

https://github.com/anza-xyz/agave/blob/f9f8b60ca15fa721c6cdd816c99dfd4e9123fd77/cost-model/src/cost_model.rs#L187-L188

The loaded accounts data size is only included in the transaction cost after the transaction execution.
Actual value is passed into the calculate_cost_for_executed_transaction() as the actual_loaded_accounts_data_size_bytes argument.
And it is populated and checked in the blockstore_processor.rs:

                Some(CostModel::calculate_cost_for_executed_transaction(
                    tx,
                    committed_tx.executed_units,
                    committed_tx.loaded_account_stats.loaded_accounts_data_size,
                    &bank.feature_set,
                ))

https://github.com/anza-xyz/agave/blob/f9f8b60ca15fa721c6cdd816c99dfd4e9123fd77/ledger/src/blockstore_processor.rs#L246-L251

We could record have a similar counter for the amount of writable data that the transaction touched.

So at the high level it can be computed similarly to what we are already doing.
It is just that the details would be different.

Not sure if there are any implications for the DX, due to the fact that we can only compute this value after a transaction execution.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, you're right. We don't have any precedent for understanding actual account data before loading/executing transactions. We rely on either scraping top level instructions (e.g. for allocate) or estimating and then adjusting.

This would seemingly have to fall into the estimate + adjust case (unless we want to do something radical like grab account data size before execution), which would likely impact DX if we push this onto the user similar to how we ask for CU/Data budget estimates.

One option is to leverage the existing Data budget value for writes as well (defaults to 2MiB per tx or can be set by user). I think this is likely small enough of a default to not bottleneck scheduling before the data budget gets refunded post execution. We would need to study this.

2. Update the cost tracker to check the current writeable account data in the
block being produced + the amount of account data that would be written with
the current transaction against the total block limit in the would_fit
function
3. Store the write limit in a constant such as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store the write limit in a constant

From past experience, I'd recommend against a constant that we intend to change later (or minimally want to support allowing to be changed). IMO a function that takes in the feature set and returns the value is more future-proof.

But, this is pretty agave-implementation specific, so may not be germane to a SIMD.

`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`
(2GB).
Comment on lines +65 to +66

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2GB is 2_147_483_648 :P

Copy link
Member

@2501babe 2501babe Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to `2_000_000_000`
(2GB).
`MAX_BLOCK_ACCOUNTS_DATA_SIZE_WRITEABLE` and set this to 2GiB
(`2 * 1024 * 1024 * 1024`).

4. Add a new `CostTrackerError` type `WouldExceedAccountDataWriteableBlockLimit`
and return this error if block limit is exceeded.
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as
retryable in execute_and_commit_transactions_locked
Comment on lines +69 to +70

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as
retryable in execute_and_commit_transactions_locked
5. Treat this new error type (`WouldExceedAccountDataWriteableBlockLimit`) as
retryable in `execute_and_commit_transactions_locked()`.


There is an edge case where if a single transaction tries to consume more than
the entire per block writeable account data budget, it should not be marked as
retryable. This might not be possible today given tx size limits and maximum
account size limits, but this could be exploitable in the future if either of
this constraints change.
Comment on lines +72 to +76

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to introduce a separate constraint that limits the amount of writable data a single transaction can access?
We have MAX_LOADED_ACCOUNTS_DATA_SIZE_BYTES equal to 64MB which limits the total amount of data a transaction can load.
We could add MAX_LOADED_ACCOUNTS_WRITEABLE_DATA_SIZE_BYTES (with an obvious semantics).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. We'll effectively be constrained to the 64MiB per tx for updated account data and 100MiB per block for newly allocated account data, so we already have some guard rails here, but we could be more explicit


## Alternatives Considered

Keep read/write data limits coupled even after direct mapping. Downside of this
is artificially constraining the amount of readable account data per block.

Unthrottle write data without any limit. This could negatively impact slot
replay times for validators with slow disks.

## Impact

Read and write account data limits per block can be increased independently of
each other.

## Security Considerations

Agave and FD (and any other) clients should implement this proposal to avoid
breaking consensus.

## Drawbacks

If a large amount of account data is attempted to be updated in a given block,
transactions will start being excluded and marked as retryable where they can
attempt to be included in the next block.

## Backwards Compatibility

With this new threshold, it is possible there are blocks that would have been
valid that are now marked dead due to exceeding the write threshold. This change
will need to be feature gated to ensure all validators change behavior in lock
step.
Loading