eth/fetcher: modify queue limits for improving sync near chain tip #1260
Description
This PR updates a few constants used in the block fetcher to suit PoS block timings, i.e. 2s as compared to Ethereum's 12s.
Rationale
The block fetcher handles downloading of block headers and bodies near the chain tip by acting on block and block hash announcements. A constant `maxQueueDist` limits which block announcements are queued, and its value was initially set to 32. This means the fetcher only stores announcements for blocks within 32 blocks of the chain tip. The rest are dropped with the log `Peer discarded announcement`. While debugging the sync issues observed recently, we identified that old block announcements were dropped while newer ones were included. This stalled the import, because some blocks were missing. It can only proceed in 2 cases.
E.g. let's say the chain tip is at 100. Bor has received block announcements up to 132 and is about to process these blocks. At the same time, the block announcement for 133 arrives and is dropped because its distance from the chain tip is > 32. More announcements can also be dropped if the original import takes longer. For simplicity, let's say the import of 100-132 finishes and the chain head is now 132. Bor can now accept new block announcements, i.e. 134, 135, and so on. But because 133 was dropped, Bor has no context about that block and keeps waiting despite having more blocks lined up. There's no clear way to proceed at this point.
The simplest option for now is to increase the queue limit so that more block announcements are stored instead of dropped. The ideal value is 6x the original, because Bor's 2s block time produces 6x as many blocks as Ethereum's 12s block time over the same period.
Changes
Checklist
Manual tests
Tested sync on an internal mainnet node