Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enha: optimize importer offline #1877

Merged
merged 7 commits into from
Nov 24, 2024
Merged

Conversation

marcospb19-cw
Copy link
Contributor

@marcospb19-cw marcospb19-cw commented Nov 19, 2024

PR Type

Enhancement, Performance


Description

  • Refactored the importer offline process for improved parallelism:
    • Separated tasks for fetching, executing, and saving blocks
    • Implemented batch processing for block execution and saving
    • Introduced new constants for channel capacities and batch sizes
  • Added batch saving capability to PermanentStorage, RocksPermanentStorage, and StratusStorage
  • Implemented a sync write option for RocksDB, controlled by a new configuration parameter
  • Optimized inmemory temporary storage operations
  • Removed the rocks_batch_writer module and its implementation
  • Updated RocksStorageState to include the sync write option and batch saving functionality
  • Added new configuration options for RocksDB in the importer-offline environment file
  • Improved error handling and logging throughout the affected modules

Changes walkthrough 📝

Relevant files
Configuration changes
2 files
historic_events_processor.rs
Update RocksStorageState initialization                                   

src/bin/historic_events_processor.rs

  • Updated RocksStorageState::new() call to include a new boolean
    parameter
  • +1/-1     
    importer-offline.env.local
    Update importer-offline configuration                                       

    config/importer-offline.env.local

  • Added ROCKS_DISABLE_SYNC_WRITE=true
  • Added ROCKS_CACHE_SIZE_MULTIPLIER=0.1
  • +3/-0     
    Enhancement
    8 files
    importer_offline.rs
    Refactor importer offline for improved parallelism             

    src/bin/importer_offline.rs

  • Refactored the importer offline process to use separate tasks for
    fetching, executing, and saving blocks
  • Introduced new constants for channel capacities and batch sizes
  • Implemented batch processing for block execution and saving
  • Updated error handling and logging
  • +167/-114
    inmemory_temporary.rs
    Optimize inmemory temporary storage                                           

    src/eth/storage/inmemory/inmemory_temporary.rs

  • Removed unused imports
  • Optimized do_read_slot function using find_map
  • Minor code cleanup and refactoring
  • +18/-38 
    permanent_storage.rs
    Add batch saving capability to PermanentStorage                   

    src/eth/storage/permanent_storage.rs

  • Added save_block_batch method to PermanentStorage trait
  • Introduced rocks_disable_sync_write configuration option
  • +11/-1   
    mod.rs
    Remove rocks_batch_writer module                                                 

    src/eth/storage/rocks/mod.rs

    • Removed rocks_batch_writer module
    +0/-3     
    rocks_batch_writer.rs
    Remove rocks_batch_writer implementation                                 

    src/eth/storage/rocks/rocks_batch_writer.rs

    • Entire file removed
    +0/-78   
    rocks_permanent.rs
    Add batch saving to RocksPermanentStorage                               

    src/eth/storage/rocks/rocks_permanent.rs

  • Updated RocksPermanentStorage::new() to include enable_sync_write
    parameter
  • Implemented save_block_batch method
  • +13/-2   
    rocks_state.rs
    Implement batch saving and sync write option                         

    src/eth/storage/rocks/rocks_state.rs

  • Added enable_sync_write field to RocksStorageState
  • Implemented save_block_batch method
  • Refactored save_block to use prepare_block_insertion
  • Updated write_in_batch_for_multiple_cfs to use enable_sync_write
    option
  • +43/-14 
    stratus_storage.rs
    Add batch saving capability to StratusStorage                       

    src/eth/storage/stratus_storage.rs

  • Implemented save_block_batch method for StratusStorage
  • Added checks for block number consistency in batch saving
  • +49/-0   
    Formatting
    1 files
    mod.rs
    Reorder storage module imports                                                     

    src/eth/storage/mod.rs

    • Reordered module imports
    +1/-2     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link

    github-actions bot commented Nov 19, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit b06e158)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Performance Optimization
    The PR introduces significant changes to optimize the importer offline process. The new implementation uses separate tasks for fetching, executing, and saving blocks, with batch processing. This should be carefully reviewed to ensure it maintains correctness while improving performance.

    Data Integrity
    The PR introduces a new configuration option to disable sync writes in RocksDB. This could potentially lead to data loss in case of system failure. The trade-offs and implications of this change should be carefully considered.

    Error Handling
    The new save_block_batch function includes multiple checks for block consistency. The error handling and logging in this function should be thoroughly reviewed to ensure all edge cases are properly handled.

    Copy link

    github-actions bot commented Nov 19, 2024

    PR Code Suggestions ✨

    Latest suggestions up to b06e158
    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Implement atomic batch saving to ensure data consistency when saving multiple blocks

    In the save_block_batch function, consider adding a transaction or atomic operation
    mechanism to ensure that either all blocks in the batch are saved successfully, or
    none are saved if an error occurs during the process.

    src/eth/storage/stratus_storage.rs [426]

    -self.perm.save_block_batch(blocks).map_err(Into::into)
    +self.perm.transaction(|tx| {
    +    tx.save_block_batch(blocks)?;
    +    Ok(())
    +}).map_err(|e| {
    +    tracing::error!(reason = ?e, "Failed to save block batch atomically");
    +    StratusError::from(e)
    +})
    Suggestion importance[1-10]: 8

    Why: This suggestion significantly improves data consistency by implementing atomic batch saving. It ensures that either all blocks in a batch are saved or none are, which is crucial for maintaining database integrity during batch operations.

    8
    Add a warning log when sync writes are disabled to alert about potential data loss risks

    Ensure that the enable_sync_write flag is properly handled throughout the codebase,
    especially in critical write operations. Consider adding a warning log when sync
    writes are disabled, as this could lead to potential data loss in case of system
    failure.

    src/eth/storage/rocks/rocks_state.rs [484]

     options.set_sync(self.enable_sync_write);
    +if !self.enable_sync_write {
    +    tracing::warn!("Sync writes are disabled. This may lead to data loss in case of system failure.");
    +}
    Suggestion importance[1-10]: 7

    Why: This suggestion improves system observability by adding a warning log when sync writes are disabled, which is crucial for alerting operators about potential data loss risks. It's a valuable addition to the existing code.

    7
    Improve error handling in the save_block_batch function to ensure proper logging and potential rollback of failed operations

    Implement error handling for the save_block_batch function to ensure that if an
    error occurs during the batch save, it's properly logged and the operation is rolled
    back if necessary.

    src/eth/storage/rocks/rocks_permanent.rs [139-143]

     fn save_block_batch(&self, block_batch: Vec<Block>) -> anyhow::Result<()> {
    -    self.state.save_block_batch(block_batch).inspect_err(|e| {
    +    self.state.save_block_batch(block_batch).map_err(|e| {
             tracing::error!(reason = ?e, "failed to save block_batch in RocksPermanent");
    +        anyhow::anyhow!("Failed to save block batch: {}", e)
         })
     }
    Suggestion importance[1-10]: 6

    Why: The suggestion enhances error handling by using map_err instead of inspect_err, allowing for more detailed error reporting and potential error transformation. This improvement aids in debugging and error management.

    6

    Previous suggestions

    Suggestions up to commit cd52c91
    CategorySuggestion                                                                                                                                    Score
    General
    Simplify code by using functional programming constructs for more concise and readable implementations

    Simplify the do_read_slot function by using find_map instead of iterating manually
    over the states.

    src/eth/storage/inmemory/inmemory_temporary.rs [267-279]

     fn do_read_slot(states: &NonEmpty<InMemoryTemporaryStorageState>, address: Address, index: SlotIndex) -> Option<Slot> {
    -    // search all
    -    for state in states.iter() {
    -        let Some(account) = state.accounts.get(&address) else { continue };
    -        let Some(&slot) = account.slots.get(&index) else { continue };
    +    let slot = states
    +        .iter()
    +        .find_map(|state| state.accounts.get(&address).and_then(|account| account.slots.get(&index)));
     
    +    if let Some(&slot) = slot {
             tracing::trace!(%address, %index, %slot, "slot found in temporary");
    -        return Some(slot);
    +        Some(slot)
    +    } else {
    +        tracing::trace!(%address, %index, "slot not found in temporary");
    +        None
         }
    -
    -    // not found
    -    tracing::trace!(%address, %index, "slot not found in temporary");
    -    None
     }
    Suggestion importance[1-10]: 9

    Why: The suggestion significantly improves code readability and maintainability by using functional programming constructs. It reduces the amount of code while maintaining the same functionality, making it easier to understand and less prone to errors.

    9
    Avoid unnecessary dereferencing when handling optional references

    Use if let Some(nonce) = change.nonce.take_ref() instead of if let Some(&nonce) =
    change.nonce.take_ref() to avoid unnecessary dereferencing.

    src/eth/storage/inmemory/inmemory_temporary.rs [155-157]

    -if let Some(&nonce) = change.nonce.take_ref() {
    -    account.info.nonce = nonce;
    +if let Some(nonce) = change.nonce.take_ref() {
    +    account.info.nonce = *nonce;
     }
    Suggestion importance[1-10]: 8

    Why: The suggestion improves code clarity and efficiency by avoiding unnecessary dereferencing. It aligns with Rust's idiomatic practices and reduces the risk of potential errors.

    8

    @marcospb19-cw marcospb19-cw force-pushed the enha-optimize-importer-offline branch from 33743c7 to deb42d7 Compare November 22, 2024 20:13
    @marcospb19-cw marcospb19-cw force-pushed the enha-optimize-importer-offline branch from deb42d7 to b06e158 Compare November 22, 2024 20:13
    @marcospb19-cw marcospb19-cw marked this pull request as ready for review November 23, 2024 23:27
    Copy link

    Persistent review updated to latest commit b06e158

    @marcospb19-cw marcospb19-cw force-pushed the enha-optimize-importer-offline branch from b06e158 to 72b8a96 Compare November 23, 2024 23:31
    @marcospb19-cw marcospb19-cw merged commit 83f662e into main Nov 24, 2024
    33 checks passed
    @marcospb19-cw marcospb19-cw deleted the enha-optimize-importer-offline branch November 24, 2024 00:13
    @gabriel-aranha-cw
    Copy link
    Contributor

    Final benchmark:
    RPS Stats: Max: 1167.00, Min: 671.00, Avg: 907.10, StdDev: 63.56
    TPS Stats: Max: 1047.00, Min: 721.00, Avg: 879.36, StdDev: 66.08
    Plot: View Plot

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants