Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enha: rework rocksdb configuration #1871

Merged
merged 65 commits into from
Dec 17, 2024
Merged

Conversation

carneiro-cw
Copy link
Contributor

@carneiro-cw carneiro-cw commented Nov 13, 2024

PR Type

Enhancement


Description

  • Optimized RocksDB configuration for point lookups, replacing previous SST-focused configs
  • Removed logs-related functionality and column family
  • Updated column family configurations with new cache settings and prefix lengths
  • Exposed rocks_state and rocks_db modules
  • Implemented prefix extractor and bloom filter settings for improved performance
  • Removed metrics export from block saving process
  • Added 'serde' feature to revm dependency for serialization support

Changes walkthrough 📝

Relevant files
Enhancement
mod.rs
Expose RocksDB modules                                                                     

src/eth/storage/permanent/rocks/mod.rs

  • Made rocks_state and rocks_db modules public
+2/-2     
rocks_config.rs
Optimize RocksDB configuration for point lookups                 

src/eth/storage/permanent/rocks/rocks_config.rs

  • Removed unused constants
  • Replaced LargeSSTFiles and FastWriteSST configs with
    OptimizedPointLookUp
  • Implemented new configuration options for RocksDB
  • Added prefix extractor and bloom filter settings
  • +32/-121
    rocks_db.rs
    Update DB creation with new config                                             

    src/eth/storage/permanent/rocks/rocks_db.rs

  • Updated create_or_open_db function to use new configuration options
  • +1/-1     
    rocks_permanent.rs
    Remove metrics export from block saving                                   

    src/eth/storage/permanent/rocks/rocks_permanent.rs

    • Removed metrics export from save_block method
    +0/-6     
    rocks_state.rs
    Refactor RocksDB state management                                               

    src/eth/storage/permanent/rocks/rocks_state.rs

  • Removed logs-related functionality
  • Updated column family configurations
  • Made generate_cf_options_map function public
  • Adjusted cache settings and added prefix lengths for some column
    families
  • +8/-21   
    Dependencies
    Cargo.toml
    Add serde feature to revm dependency                                         

    Cargo.toml

  • Added 'serde' feature to revm dependency for both aarch64 and
    non-aarch64 targets
  • +2/-2     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link

    github-actions bot commented Nov 13, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit 674f86c)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Performance Optimization
    The new configuration for RocksDB is optimized for point lookups. This is a significant change that may affect overall database performance. Careful testing and benchmarking should be done to ensure this change improves performance as intended.

    Removal of Logs Functionality
    The logs-related functionality has been removed from the RocksStorageState struct and associated methods. This is a major change that could affect log retrieval and storage. Ensure that this removal doesn't break any existing functionality that depends on logs.

    Metrics Export Removal
    The metrics export functionality has been removed from the save_block method. This could affect monitoring and debugging capabilities. Verify if this removal is intentional and if there are alternative ways to capture these metrics.

    Copy link

    github-actions bot commented Nov 13, 2024

    PR Code Suggestions ✨

    Latest suggestions up to 674f86c
    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    General
    Use a constant for the bloom filter ratio to improve maintainability and configurability

    Consider using a constant or configuration value for the bloom filter ratio instead
    of hardcoding 15.5.

    src/eth/storage/permanent/rocks/rocks_config.rs [34]

    -block_based_options.set_bloom_filter(15.5, false);
    +const BLOOM_FILTER_RATIO: f64 = 15.5;
    +block_based_options.set_bloom_filter(BLOOM_FILTER_RATIO, false);
    Suggestion importance[1-10]: 6

    Why: Using a constant for the bloom filter ratio improves code maintainability and makes it easier to adjust the value in the future if needed. It's a good practice for configuration values.

    6
    Use a constant for the memtable prefix bloom ratio to improve maintainability and configurability

    Consider using a constant or configuration value for the memtable prefix bloom ratio
    instead of hardcoding 0.15.

    src/eth/storage/permanent/rocks/rocks_config.rs [46]

    -opts.set_memtable_prefix_bloom_ratio(0.15);
    +const MEMTABLE_PREFIX_BLOOM_RATIO: f64 = 0.15;
    +opts.set_memtable_prefix_bloom_ratio(MEMTABLE_PREFIX_BLOOM_RATIO);
    Suggestion importance[1-10]: 6

    Why: Similar to the first suggestion, using a constant for the memtable prefix bloom ratio enhances code maintainability and allows for easier configuration changes in the future.

    6
    Use a constant for the data block hash ratio to improve maintainability and configurability

    Consider using a constant or configuration value for the data block hash ratio
    instead of hardcoding 0.3.

    src/eth/storage/permanent/rocks/rocks_config.rs [60]

    -block_based_options.set_data_block_hash_ratio(0.3);
    +const DATA_BLOCK_HASH_RATIO: f64 = 0.3;
    +block_based_options.set_data_block_hash_ratio(DATA_BLOCK_HASH_RATIO);
    Suggestion importance[1-10]: 6

    Why: Consistent with the previous suggestions, using a constant for the data block hash ratio improves code maintainability and makes it easier to adjust the value if needed in the future.

    6

    Previous suggestions

    Suggestions up to commit 73754b3
    CategorySuggestion                                                                                                                                    Score
    Best practice
    Use anyhow::ensure! for more idiomatic error handling when checking arguments

    Use anyhow::ensure! instead of if and bail! for a more idiomatic error handling
    approach when checking the number of arguments.

    src/bin/rocks_migration.rs [27-29]

    -if args.len() != 3 {
    -    bail!("Expected 2 arguments: source and destination DB paths");
    -}
    +anyhow::ensure!(args.len() == 3, "Expected 2 arguments: source and destination DB paths");
    Suggestion importance[1-10]: 7

    Why: Using anyhow::ensure! is more idiomatic in Rust and provides a cleaner way to handle errors. It combines the condition check and error message in a single line, improving readability.

    7
    Enhancement
    Introduce a constant for the batch size to improve code maintainability

    Use a constant for the batch size (10_000) to improve maintainability and make it
    easier to adjust in the future.

    src/bin/rocks_migration.rs [51-55]

    -if count % 10_000 == 0 {
    +const BATCH_SIZE: usize = 10_000;
    +if count % BATCH_SIZE == 0 {
         tracing::info!("Processed {count} entries for CF {cf_handle}");
         write_in_batch(&dest_state, batch)?;
         batch = WriteBatch::default();
     }
    Suggestion importance[1-10]: 6

    Why: Using a constant for the batch size improves code maintainability and makes it easier to adjust in the future. This is a good practice for magic numbers in code.

    6
    Simplify nested if-let structure using a match arm with a guard for improved readability

    Consider using a match arm with a guard instead of a nested if-let for better
    readability and to avoid unnecessary nesting.

    src/eth/storage/rocks/rocks_state.rs [381-387]

    -BlockFilter::Hash(block_hash) => {
    -    if let Some(block_number) = self.blocks_by_hash.get(&(*block_hash).into())? {
    -        self.blocks_by_number.get(&block_number)
    -    } else {
    -        Ok(None)
    -    }
    -}
    +BlockFilter::Hash(block_hash) => match self.blocks_by_hash.get(&(*block_hash).into())? {
    +    Some(block_number) => self.blocks_by_number.get(&block_number),
    +    None => Ok(None),
    +},
    Suggestion importance[1-10]: 5

    Why: The suggested change simplifies the code structure, making it more readable. However, the improvement is minor and doesn't significantly impact functionality or performance.

    5

    @carneiro-cw carneiro-cw changed the base branch from main to breaking-enha-rocksdb December 17, 2024 04:11
    @carneiro-cw carneiro-cw marked this pull request as ready for review December 17, 2024 04:21
    Copy link

    Persistent review updated to latest commit 674f86c

    @carneiro-cw carneiro-cw merged commit 4fc58a0 into breaking-enha-rocksdb Dec 17, 2024
    34 checks passed
    @carneiro-cw carneiro-cw deleted the rocks_db_config branch December 17, 2024 04:25
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    1 participant