ReplayStage Panic on Restart #25716

Closed
buffalu opened this issue Jun 1, 2022 · 2 comments

buffalu (Contributor) commented Jun 1, 2022

Problem

I tried to restart my validator today and ran into the panic below (full log attached: restart_crash.txt):

thread 'solana-replay-stage' panicked at 'assertion failed: !new_storage_location.is_cached()', runtime/src/accounts_db.rs:3632:17
stack backtrace:
   0: rust_begin_unwind
             at ./rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at ./rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:116:14
   2: core::panicking::panic
             at ./rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:48:5
   3: solana_runtime::accounts_db::AccountsDb::retry_to_get_account_accessor
   4: solana_runtime::accounts_db::AccountsDb::load
   5: solana_runtime::accounts::Accounts::load_slow
   6: solana_program_runtime::sysvar_cache::SysvarCache::fill_missing_entries
   7: solana_runtime::bank::sysvar_cache::<impl solana_runtime::bank::Bank>::fill_missing_sysvar_cache_entries
   8: solana_runtime::bank::Bank::_new_from_parent
   9: solana_runtime::bank::Bank::new_from_parent_with_options
  10: solana_core::replay_stage::ReplayStage::generate_new_bank_forks
  11: solana_core::replay_stage::ReplayStage::new::{{closure}}
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2022-06-01T20:24:18.687904257Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solana-replay-stage" one=1i message="panicked at 'assertion failed: !new_storage_location.is_cached()', runtime/src/accounts_db.rs:3632:17" location="runtime/src/accounts_db.rs:3632:17" version="\"1.10.23 (src:777c66ce; feat:1054763811)\""

Proposed Solution

jstarry (Member) commented Jun 3, 2022

Which cluster and which validator version were you using?

steviez (Contributor) commented Oct 11, 2022

We have recently seen and fixed several instances of this same panic in v1.14. Two log lines from the file you provided are relevant:

[2022-06-01T20:24:18.612941132Z INFO  solana_core::replay_stage] new fork:135986384 parent:135986379 root:135986379

[2022-06-01T20:24:18.613045169Z ERROR solana_runtime::accounts_db] set_hash: already exists; multiple forks with shared slot 135986384 as child (parent: 135986379)!?

  • The first log line shows that 135986384 was created with parent 135986379.
  • The second log line indicates that a bank was created for the same slot twice; we know this will cause a panic.

However, 135986384 was actually the child of 135986383, as shown here. So your node seemingly had a bad version of this block.
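The parent check described in this comment can be done mechanically against a saved log. The following is only a rough sketch: `parse_new_fork`, `mismatched_parents`, and the `canonical` map are hypothetical names made up for illustration (they are not part of the Solana codebase), and the sketch assumes the known-good slot-to-parent pairs were looked up by hand from some trusted source such as an explorer.

```rust
use std::collections::HashMap;

/// Parses a replay-stage log line of the form
/// `... new fork:<slot> parent:<parent> root:<root>` into (slot, parent).
/// Returns None for any line that does not have that shape.
fn parse_new_fork(line: &str) -> Option<(u64, u64)> {
    let marker = "new fork:";
    let rest = &line[line.find(marker)? + marker.len()..];
    let mut fields = rest.split_whitespace();
    let slot: u64 = fields.next()?.parse().ok()?;
    let parent: u64 = fields.next()?.strip_prefix("parent:")?.parse().ok()?;
    Some((slot, parent))
}

/// Compares every logged (slot, parent) pair against a hand-built map of
/// known-good parents (an assumption of this sketch) and returns the
/// mismatches as (slot, logged_parent, canonical_parent).
fn mismatched_parents(lines: &[&str], canonical: &HashMap<u64, u64>) -> Vec<(u64, u64, u64)> {
    let mut mismatches = Vec::new();
    for (slot, logged_parent) in lines.iter().filter_map(|l| parse_new_fork(l)) {
        if let Some(&expected) = canonical.get(&slot) {
            if expected != logged_parent {
                mismatches.push((slot, logged_parent, expected));
            }
        }
    }
    mismatches
}
```

Fed the `new fork` line quoted above and a map saying that 135986384 has parent 135986383, `mismatched_parents` reports that slot with logged parent 135986379, which is the discrepancy this comment points out.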

Moreover, 135986379 was the highest optimistically confirmed slot, which was picked and used as the snapshot restart slot. The timestamp on your `new fork` log line was 20:24 UTC; however, this status page indicates that block production didn't resume until 21:00 UTC. Thus, I don't think you should have been able to replay anything > 135986379 until block production resumed.
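The timing argument here (a slot above the restart slot being replayed before block production resumed) can also be expressed as a small check. This is a sketch under stated assumptions: `new_fork_slot`, `log_second`, and `replayed_too_early` are hypothetical helper names, both timestamps are assumed to be UTC in the `YYYY-MM-DDTHH:MM:SS` layout, and the resume time passed in is whatever the status page reports (the date used in the usage note is assumed to be the same day as the log).

```rust
/// Extracts the slot from a `... new fork:<slot> parent:<parent> ...` log line.
fn new_fork_slot(line: &str) -> Option<u64> {
    let marker = "new fork:";
    let rest = &line[line.find(marker)? + marker.len()..];
    rest.split_whitespace().next()?.parse().ok()
}

/// Returns the first 19 characters ("YYYY-MM-DDTHH:MM:SS") of the bracketed
/// timestamp at the start of a log line, dropping fractional seconds.
fn log_second(line: &str) -> Option<&str> {
    line.strip_prefix('[')?.get(..19)
}

/// Some(true) if the line shows a bank being created for a slot above
/// `restart_slot` at a time earlier than `resume_utc`.
/// Fixed-width, zero-padded UTC timestamps order correctly as plain strings,
/// so no date library is needed for this comparison.
fn replayed_too_early(line: &str, restart_slot: u64, resume_utc: &str) -> Option<bool> {
    let slot = new_fork_slot(line)?;
    let logged_at = log_second(line)?;
    let resumed_at = resume_utc.get(..19)?;
    Some(slot > restart_slot && logged_at < resumed_at)
}
```

For the `new fork` line quoted earlier, with restart slot 135986379 and a resume time of `2022-06-01T21:00:00Z`, the check returns `Some(true)`, matching the reasoning that this bank should not have existed yet.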

So, my hypothesis is that your node had a version of 135986384 that was marked dead, and then ran into the same behavior as outlined in #28343.

I doubt you still have the full logs 🤣 so I don't think we can be 100% certain, but in any case, I'm going to close this issue.

steviez closed this as completed Oct 11, 2022