InvalidFeePayerFilter #32939
4dfebf1 to d106c68
Codecov Report
@@ Coverage Diff @@
## master #32939 +/- ##
========================================
Coverage 82.0% 82.0%
========================================
Files 796 797 +1
Lines 215727 215926 +199
========================================
+ Hits 176897 177100 +203
+ Misses 38830 38826 -4
I wrote a benchmark to test this code against master. The bench is a bit hacky: I filled the buffer of a single banking-stage thread half with prioritized spam (unable to pay fees) and half with valid transactions, then measured how long it took to get through all transactions. After observing the first spam transaction, the rest should be filtered out more quickly by the MI batch selection:
The bench is a bit flaky, with a SIGSEGV if I try to join my poh_service, so I'm not committing the bench code quite yet.
TransactionError::AccountNotFound
| TransactionError::InsufficientFundsForFee
| TransactionError::InvalidAccountForFee => {
    self.invalid_fee_payer_filter.add(*tx.message().fee_payer());
Here is the only place we add to our filter currently.
These 3 errors:
- AccountNotFound - could not find fee-payer during load
- InsufficientFundsForFee - fee-payer exists, but doesn't have enough lamports
- InvalidAccountForFee - this account can't pay fees
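A minimal sketch of the rule described above, for illustration only. `Pubkey`, the reduced `TransactionError` enum, the `Mutex<HashSet>` storage, and the helper `record_result` are simplified stand-ins assumed for this example; they are not the PR's actual types (the PR stores keys in a `DashSet<Pubkey>`).

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Stand-in for the real `Pubkey` type (assumption for this sketch).
pub type Pubkey = [u8; 32];

/// Reduced stand-in for `TransactionError`: only the variants discussed here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TransactionError {
    AccountNotFound,
    InsufficientFundsForFee,
    InvalidAccountForFee,
    AccountInUse,
}

/// Hypothetical filter backed by a mutex-guarded set instead of a `DashSet`.
#[derive(Default)]
pub struct InvalidFeePayerFilter {
    recent_invalid_fee_payers: Mutex<HashSet<Pubkey>>,
}

impl InvalidFeePayerFilter {
    pub fn add(&self, fee_payer: Pubkey) {
        self.recent_invalid_fee_payers.lock().unwrap().insert(fee_payer);
    }

    pub fn should_reject(&self, fee_payer: &Pubkey) -> bool {
        self.recent_invalid_fee_payers.lock().unwrap().contains(fee_payer)
    }
}

/// Hypothetical helper: remembers the fee payer only for the three
/// fee-related errors; any other result leaves the filter untouched.
pub fn record_result(
    filter: &InvalidFeePayerFilter,
    fee_payer: Pubkey,
    result: &Result<(), TransactionError>,
) {
    match result {
        Err(TransactionError::AccountNotFound)
        | Err(TransactionError::InsufficientFundsForFee)
        | Err(TransactionError::InvalidAccountForFee) => filter.add(fee_payer),
        _ => {}
    }
}
```

In this sketch an unrelated error such as `AccountInUse` does not mark the fee payer, so only payers that actually failed the fee check are rejected later.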
won't AccountNotFound get thrown for accounts besides the fee-payer?
:my_eyes_are_bleeding:
solana/accounts-db/src/accounts.rs
Lines 493 to 497 in dbe4017
if !validated_fee_payer {
    error_counters.account_not_found += 1;
    return Err(TransactionError::AccountNotFound);
}
I think that was the only instance 😭
Maybe it's better to swap that error with InsufficientFunds or rename it to FeePayerAccountNotFound so we don't lose the distinction
I think that was the only instance 😭 Maybe it's better to swap that error with InsufficientFunds or rename it to FeePayerAccountNotFound so we don't lose the distinction
could. separate pr tho
@@ -285,6 +287,12 @@ impl LatestUnprocessedVotes {
bank.as_ref(),
)
{
if invalid_fee_payer_filter
Check vote transactions for invalid fee-payers prior to forwarding.
let original_length = deserialized_packets.len();
let filtered_packets = deserialized_packets
    .into_iter()
    .filter(|packet| {
Remove any packets from known invalid fee-payers when receiving from sigverify into banking stage.
This prevents them from taking up buffer space.
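The receive-side filtering can be sketched roughly as follows. `Packet`, `Pubkey`, and `filter_received` are hypothetical simplifications introduced for this example, and a plain `HashSet` stands in for the shared filter.

```rust
use std::collections::HashSet;

/// Stand-in for the real `Pubkey` type (assumption for this sketch).
pub type Pubkey = [u8; 32];

/// Hypothetical minimal packet: only the fee payer matters here.
pub struct Packet {
    pub fee_payer: Pubkey,
}

/// Drops packets whose fee payer is already known to be invalid.
/// Returns the surviving packets and the number that were dropped,
/// mirroring the `original_length` bookkeeping in the snippet above.
pub fn filter_received(
    packets: Vec<Packet>,
    known_invalid: &HashSet<Pubkey>,
) -> (Vec<Packet>, usize) {
    let original_length = packets.len();
    let filtered: Vec<Packet> = packets
        .into_iter()
        .filter(|packet| !known_invalid.contains(&packet.fee_payer))
        .collect();
    let dropped = original_length - filtered.len();
    (filtered, dropped)
}
```

Returning the dropped count alongside the survivors keeps the number of filtered packets available for metrics, assuming that is what the length bookkeeping is for.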
let message = &packet.transaction().get_message().message;
if invalid_fee_payer_filter.should_reject(&message.static_account_keys()[0]) {
Check for known invalid fee-payers during MI batch selection. Don't even attempt to load and execute these transactions.
The plan is to eventually also check txs prior to signature verification in SigVerifyStage so we don't even waste resources verifying them. That's a bit more involved, and I think it's a good fit for a separate PR.
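As a rough illustration of skipping known offenders during batch selection, the sketch below treats the first static account key as the fee payer, as the snippet above does. `Message`, `select_batch`, and the `batch_size` parameter are hypothetical names for this example and do not reflect the actual batch selection code.

```rust
use std::collections::HashSet;

/// Stand-in for the real `Pubkey` type (assumption for this sketch).
pub type Pubkey = [u8; 32];

/// Hypothetical message: the fee payer is the first static account key.
pub struct Message {
    pub static_account_keys: Vec<Pubkey>,
}

/// Picks up to `batch_size` candidates, skipping any whose fee payer is
/// known to be invalid so they are never loaded or executed.
pub fn select_batch<'a>(
    candidates: &'a [Message],
    known_invalid: &HashSet<Pubkey>,
    batch_size: usize,
) -> Vec<&'a Message> {
    candidates
        .iter()
        .filter(|message| match message.static_account_keys.first() {
            Some(fee_payer) => !known_invalid.contains(fee_payer),
            // A message without account keys cannot pay fees; skip it.
            None => false,
        })
        .take(batch_size)
        .collect()
}
```

Only index 0 is checked, so a known-invalid key that appears elsewhere in a message does not cause that message to be skipped.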
core/src/invalid_fee_payer_filter.rs
Outdated
}

/// Simple wrapper thread that periodically resets the filter.
pub struct InvalidFeePayerFilterThread {
Hmm, re-thinking after I posted the PR: I probably don't really need a separate thread for this at all.
- Just add an atomic interval on the InvalidFeePayerFilter
- Banking threads just call some fn that resets & reports when the atomic interval says it is time
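One way the thread-free reset could look, as a sketch under stated assumptions: the interval is tracked with a plain `AtomicU64` timestamp rather than any existing interval helper, the caller passes the current time in milliseconds so the logic stays deterministic, and `maybe_reset` is a hypothetical method name.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Stand-in for the real `Pubkey` type (assumption for this sketch).
pub type Pubkey = [u8; 32];

/// Hypothetical filter that resets itself when polled by worker threads.
pub struct InvalidFeePayerFilter {
    recent_invalid_fee_payers: Mutex<HashSet<Pubkey>>,
    last_reset_ms: AtomicU64,
    reset_interval_ms: u64,
}

impl InvalidFeePayerFilter {
    pub fn new(reset_interval_ms: u64) -> Self {
        Self {
            recent_invalid_fee_payers: Mutex::new(HashSet::new()),
            last_reset_ms: AtomicU64::new(0),
            reset_interval_ms,
        }
    }

    pub fn add(&self, fee_payer: Pubkey) {
        self.recent_invalid_fee_payers.lock().unwrap().insert(fee_payer);
    }

    pub fn should_reject(&self, fee_payer: &Pubkey) -> bool {
        self.recent_invalid_fee_payers.lock().unwrap().contains(fee_payer)
    }

    /// Any banking thread may call this. Returns true only for the caller
    /// that actually performed the reset for the elapsed interval.
    pub fn maybe_reset(&self, now_ms: u64) -> bool {
        let last = self.last_reset_ms.load(Ordering::Relaxed);
        if now_ms.saturating_sub(last) < self.reset_interval_ms {
            return false;
        }
        // Exactly one thread wins the compare-exchange per interval.
        if self
            .last_reset_ms
            .compare_exchange(last, now_ms, Ordering::Relaxed, Ordering::Relaxed)
            .is_err()
        {
            return false;
        }
        self.recent_invalid_fee_payers.lock().unwrap().clear();
        true
    }
}
```

The thread that wins the compare-exchange is also the natural place to report metrics, since it runs once per interval.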
Looks good to me. Filtering txs that can't or won't pay at banking_stage's receiving, processing, and forwarding points is a thoughtful way to improve banking QoS. Even though it would be defenseless against a spammer continuously rotating invalid payer accounts, this change is, imo, still good.
Will take another round of review later.

#[derive(Default)]
pub struct InvalidFeePayerFilter {
    recent_invalid_fee_payers: DashSet<Pubkey>,
There's resource-exhaustion attack potential in retaining the pubkeys here. Consider Deduper as a high-performance, fixed-size, statistical accounting mechanism.
Had that in mind as a follow-up, which is why I made it a separate class.
I can just do it in this PR tho
Looked into this a bit, and I could be wrong, but it looks to me like resetting or clearing the deduper/bloom filter would require mutability, so we'd need some sort of locking mechanism. Since this filter is shared by multiple threads, that would block other threads, which is something I'm trying to avoid.
It might be better to just evict LRU entries if we would exceed some capacity. That would protect us from OOMing, but could slow things down if we exceed capacity, versus periodically locking something like the deduper. wdyt?
isn't deduper all atomics?
Not for resetting:
pub fn maybe_reset<R: Rng>(
    &mut self,
    ...
because the underlying bits is stored in a Vec: bits: Vec<AtomicU64>,
@t-nelson are you content with this decision to just randomly evict if we see too many invalid fee payers?
performance in the bench i was running (1 very active invalid fee-payer) showed no difference between bloom and dash map internally.
seems like bloom should be preferable in that case due to constant size?
@t-nelson are you content with this decision to just randomly evict if we see too many invalid fee payers?
seems fine, yeah
seems like bloom should be preferable in that case due to constant size?
thoughts?
a pre-allocated dashmap also has a constant size, although I think larger than the double-bloom filter for a similar number of items.
Since there doesn't seem to be any time performance benefit, I was sticking with the simpler implementation, instead of maintaining 2 bloom-filters to allow us to "atomically" clear.
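To make the capacity-bounded alternative concrete, here is a minimal sketch. It is not the PR's implementation: it uses a mutex-guarded `HashSet` instead of a `DashSet`, and it evicts an arbitrary entry (whichever the set iterator yields first) as a stand-in for random eviction. `BoundedInvalidFeePayerSet` and `max_entries` are hypothetical names.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Stand-in for the real `Pubkey` type (assumption for this sketch).
pub type Pubkey = [u8; 32];

/// Hypothetical pre-allocated set that never grows past `max_entries`.
pub struct BoundedInvalidFeePayerSet {
    max_entries: usize,
    entries: Mutex<HashSet<Pubkey>>,
}

impl BoundedInvalidFeePayerSet {
    pub fn with_capacity(max_entries: usize) -> Self {
        assert!(max_entries > 0, "capacity must be non-zero");
        Self {
            max_entries,
            entries: Mutex::new(HashSet::with_capacity(max_entries)),
        }
    }

    /// Inserts `key`, evicting an arbitrary existing entry when full.
    pub fn add(&self, key: Pubkey) {
        let mut entries = self.entries.lock().unwrap();
        if entries.contains(&key) {
            return;
        }
        if entries.len() >= self.max_entries {
            // Copy the victim out first so the iterator borrow has ended
            // before the set is mutated.
            let victim = entries.iter().next().copied();
            if let Some(victim) = victim {
                entries.remove(&victim);
            }
        }
        entries.insert(key);
    }

    pub fn should_reject(&self, key: &Pubkey) -> bool {
        self.entries.lock().unwrap().contains(key)
    }

    pub fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

Bounding the entry count caps memory at the cost of occasionally forgetting a known offender, which then has to fail once more before it is filtered again.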
@t-nelson not sure if you saw the reply to your "thoughts?"
.iter()
.map(|tx| {
    if invalid_fee_payer_filter.should_reject(&tx.message().account_keys()[0]) {
        Err(TransactionError::InsufficientFundsForFee)
InvalidAccountForFee would have less potential to mislead as an aggregate error here
I was about to suggest using a new TransactionError specifically to indicate that it failed because the payer account is blacklisted.
Mainly thinking from a UX perspective: sophisticated abusers will soon find a way to get around this filter, so the 2 sec blackout would probably apply more to innocent mistakes. A clear error might help in that case?
Unrecorded transaction errors do not get reported back to the users though. It'd purely be an internal metric - similar to our AccountInUse error metrics.
c3a814a to 99ef4fd
.iter()
.map(|tx| {
    if invalid_fee_payer_filter.should_reject(&tx.message().account_keys()[0]) {
        // Actual error variant is not used - Result is only used to skip additional
core eng will use the error variant for debugging. conflating this case with others may mislead those efforts
That's kind of why I added the comment here. It makes it clear that these will not show up in the regular error metrics.
The only way someone finds this is if they search for the error variant, in which case they will see this comment.
I changed it to the more appropriate variant InvalidAccountForFee. Unless we want to create a new error variant, I'm not sure what other action to take here.
@@ -50,7 +56,7 @@ impl PacketReceiver {
 recv_timeout,
 unprocessed_transaction_storage.max_receive_size(),
 )
- // Consumes results if Ok, otherwise we keep the Err
+ // Consumes results if Ok, otherwise we keep the Err)
nit: typo?
2fef39b to afe8dd8
Problem
Some transactions that cannot pay fees make it into banking stage, often from repeat offenders. Processing these transactions takes time, which can cause fewer valid transactions to make it into the block.
Summary of Changes
- InvalidFeePayerFilter
- InvalidFeePayerFilter
- InvalidFeePayerFilter during batch selection
- InvalidFeePayerFilter before forwarding
- BankingStage checks InvalidFeePayerFilter when receiving packets from SigVerify
- BankingStage will periodically clear the filter
Fixes #