-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move more logic in blockstore_processor behind allow_dead_slots flag #2341
Conversation
A flag in ProcessOptions can be set to allow dead slots. If the flag is not set and a block is marked dead, fetching the entries will fail as the entry fetch method internally checks if the slot is dead. Within process_next_slots(), new Banks are created as replay progresses through slots. If allow_dead_slots=false and a dead slot is loaded, replay of the slot will error. That error is handled and results in the Bank being removed from BankForks. Instead of allowing the extra work to proceed, check if new slots to replay are dead (and allow_dead_slots=false) BEFORE the creation of a new Bank.
1e76423
to
2dc1b2d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -1725,18 +1725,21 @@ fn process_next_slots( | |||
blockstore: &Blockstore, | |||
leader_schedule_cache: &LeaderScheduleCache, | |||
pending_slots: &mut Vec<(SlotMeta, Bank, Hash)>, | |||
halt_at_slot: Option<Slot>, | |||
opts: &ProcessOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't love the name, but I know you're just maintaining the existing convention, so I'll hold my nose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
With this change, do we feel confident to upgrade the set_hash
to a panic? i.e. running solana-labs#33186 on master/inv net for a while and see what happens?
Potentially, let me think about it a little and will followup with you. Per this Discord message, Brooks had a similar idea at one point but mentioned seeing a CI failure. Obviously, if we have a test that hits this, that would be great to fix, but for a live cluster, there is still more unknown |
Problem
A flag in ProcessOptions can be set to allow dead slots. If the flag is
not set and a block is marked dead, fetching the entries will fail as
the blockstore method to retrive entries checks if the slot is dead.
Within process_next_slots(), new Banks are created as children of
already replayed banks are discovered. The children Banks are created
prior to fetching entries, and thus, a Bank could be created for a
dead slot that will eventually be discarded.
Summary of Changes
Instead of allowing the extra work of creating a Bank to proceed, check
if a slot is dead (and allow_dead_slots=false) BEFORE creating a Bank
for the slot.
For the case of a validator,
allow_dead_slots=false
for the local ledger replay at startup. So, we will avoid processing any known dead slots altogether. On the other hand,ReplayStage
will create aBank
for dead slots, see it is dead and then proceed on other forks. By avoiding the creation ofBank
in local ledger replay, we'll avoid this non-fatal error:Fixes solana-labs#28343