Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enha: implement storage cache #1887

Merged
merged 4 commits into from
Dec 2, 2024
Merged

Conversation

marcospb19-cw
Copy link
Contributor

@marcospb19-cw marcospb19-cw commented Nov 28, 2024

PR Type

Enhancement


Description

  • Implemented a new StorageCache system using LRU caches for slots and accounts to improve performance.
  • Refactored StratusStorage to integrate the new caching mechanism, modifying read_account and read_slot methods.
  • Updated the importer_offline logic with new constants for cache size and batch management.
  • Enhanced the Address struct with additional ordering traits.
  • Replaced InMemoryTemporaryAccount with the more versatile AccountWithSlots struct.
  • Added new dependencies (lru, parking_lot, rustc-hash, smallvec) to support the caching implementation.
  • Updated e2e tests to reflect changes in error messages.

Changes walkthrough 📝

Relevant files
Enhancement
importer_offline.rs
Refactor block execution and storage saving logic               

src/bin/importer_offline.rs

  • Replaced STORAGE_SAVER_BATCH_SIZE with new constants for cache size
    and batch management
  • Updated channel capacity for execute_to_save_tx
  • Modified block execution loop to use new SAVER_BATCH_SIZE
  • +12/-16 
    address.rs
    Add ordering traits to Address struct                                       

    src/eth/primitives/address.rs

    • Added PartialOrd and Ord traits to Address struct derivation
    +1/-2     
    cache.rs
    Implement StorageCache for caching slots and accounts       

    src/eth/storage/cache.rs

  • Implemented new StorageCache struct with LRU caches for slots and
    accounts
  • Added methods for caching and retrieving slots and accounts
  • +89/-0   
    mod.rs
    Add StorageCache and modify Storage trait                               

    src/eth/storage/mod.rs

  • Added StorageCache to public exports
  • Modified Storage trait to be Send + Sync + 'static
  • Implemented AccountWithSlots struct
  • +20/-1   
    stratus_storage.rs
    Integrate StorageCache into StratusStorage                             

    src/eth/storage/stratus_storage.rs

  • Added StorageCache to StratusStorage struct
  • Modified read_account and read_slot methods to use cache
  • Updated save_execution to cache changes
  • +89/-64 
    inmemory.rs
    Replace InMemoryTemporaryAccount with AccountWithSlots     

    src/eth/storage/temporary/inmemory.rs

    • Replaced InMemoryTemporaryAccount with AccountWithSlots
    +3/-18   
    Tests
    leader-follower-miner.test.ts
    Update error message assertion in miner mode change test 

    e2e/cloudwalk-contracts/integration/test/leader-follower-miner.test.ts

    • Updated error message assertion in test case
    +3/-1     
    Dependencies
    Cargo.toml
    Add new dependencies for caching implementation                   

    Cargo.toml

    • Added new dependencies: lru, parking_lot, rustc-hash, smallvec
    +4/-0     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link

    github-actions bot commented Nov 28, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit 20189b2)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Performance Concern
    The new caching mechanism in StratusStorage might introduce overhead for small read operations. The implementation should be reviewed to ensure it doesn't negatively impact performance for frequent, small reads.

    Potential Race Condition
    The use of multiple locks in the cache_account_and_slots_from_changes method could potentially lead to a race condition. The locking strategy should be reviewed to ensure thread safety.

    Unexplained Constant
    The SAVER_CHANNEL_CAPACITY constant is set to BATCH_COUNT - 2 with a TODO comment. This needs to be explained or addressed to ensure the rationale is clear and the value is correct.

    Copy link

    github-actions bot commented Nov 28, 2024

    PR Code Suggestions ✨

    Latest suggestions up to 20189b2
    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    General
    Extract common logic in read methods to reduce duplication

    The read_account and read_slot methods contain very similar logic. Consider
    extracting the common pattern into a separate function to reduce code duplication
    and improve maintainability.

    src/eth/storage/stratus_storage.rs [148-245]

    +fn read_from_storage<T, F, G>(&self, key: impl Display, point_in_time: PointInTime, cache_read: F, temp_read: G) -> Result<T, StratusError>
    +where
    +    F: Fn() -> Option<T>,
    +    G: Fn() -> Result<Option<T>, StratusError>,
    +{
    +    // ... (common logic extracted here)
    +}
    +
     fn read_account(&self, address: Address, point_in_time: PointInTime) -> Result<Account, StratusError> {
    -    // ... (similar logic to read_slot)
    +    self.read_from_storage(address, point_in_time, 
    +        || self.cache.get_account(address),
    +        || self.temp.read_account(address))
     }
     
     fn read_slot(&self, address: Address, index: SlotIndex, point_in_time: PointInTime) -> Result<Slot, StratusError> {
    -    // ... (similar logic to read_account)
    +    self.read_from_storage(format!("{}:{}", address, index), point_in_time,
    +        || self.cache.get_slot(address, index),
    +        || self.temp.read_slot(address, index))
     }
    Suggestion importance[1-10]: 8

    Why: Extracting common logic from read_account and read_slot methods into a separate function would significantly reduce code duplication, improving maintainability and readability of the code. This refactoring aligns with the DRY principle and could make future modifications easier.

    8
    Use RwLock instead of Mutex for cache structures to allow concurrent reads

    Consider using a read-write lock (RwLock) instead of a regular Mutex for slot_cache
    and account_cache. This would allow multiple readers to access the cache
    concurrently, potentially improving performance in read-heavy scenarios.

    src/eth/storage/stratus_storage.rs [16-19]

     pub struct StorageCache {
    -    slot_cache: Mutex<LruCache<(Address, SlotIndex), SlotValue, FxBuildHasher>>,
    -    account_cache: Mutex<LruCache<Address, Account, FxBuildHasher>>,
    +    slot_cache: RwLock<LruCache<(Address, SlotIndex), SlotValue, FxBuildHasher>>,
    +    account_cache: RwLock<LruCache<Address, Account, FxBuildHasher>>,
     }
    Suggestion importance[1-10]: 7

    Why: Using RwLock instead of Mutex for cache structures could potentially improve performance in read-heavy scenarios by allowing multiple concurrent reads. This change aligns well with the caching purpose and could provide a significant performance boost.

    7

    Previous suggestions

    Suggestions up to commit fe7b2f9
    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Ensure consistency between cached slots and the eviction queue by adding missing method calls

    Add the missing method call to self.slot_eviction_queue.write().push() after
    inserting a slot into self.cached_slots to maintain consistency between the two data
    structures.

    src/eth/storage/cache.rs [58-61]

     for slot in slots {
    -    self.cached_slots.insert((slot.index, change.address), (slot.value, self.next_id()));
    -    self.slot_eviction_queue
    +    let id = self.next_id();
    +    self.cached_slots.insert((slot.index, change.address), (slot.value, id));
    +    self.slot_eviction_queue.write().push((id, slot.index, change.address));
     }
    Suggestion importance[1-10]: 9

    Why: This suggestion addresses a critical issue where the slot eviction queue is not being updated when new slots are cached. This inconsistency could lead to memory leaks or incorrect eviction behavior, significantly impacting the cache's functionality.

    9
    Implement proper error handling for storage read operations

    Implement proper error handling for the read_account and read_slot methods instead
    of silently ignoring errors.

    src/eth/storage/stratus_storage.rs [175-177]

     if let Err(ref e) = m.result {
    +    return Err(StratusError::StorageError(e.to_string()));
     }
    Suggestion importance[1-10]: 8

    Why: Proper error handling is crucial for robust software. The current implementation silently ignores errors, which could lead to unexpected behavior or data inconsistencies. Implementing proper error handling improves the system's reliability and debuggability.

    8
    Align the MAX_BLOCKS constant with the value used in other parts of the system

    Update the MAX_BLOCKS constant to match the value used in other parts of the code
    (86400) to ensure consistency in block retention across the system.

    src/eth/storage/temporary/inmemory.rs [31]

    -const MAX_BLOCKS: usize = 1;
    +const MAX_BLOCKS: usize = 86_400;
    Suggestion importance[1-10]: 7

    Why: Consistency in configuration values across the system is important for predictable behavior. Changing MAX_BLOCKS from 1 to 86,400 aligns it with other parts of the code, potentially improving the system's overall coherence and preventing unexpected behaviors due to mismatched constants.

    7

    @marcospb19-cw marcospb19-cw force-pushed the enha-implement-storage-cache branch from fe7b2f9 to ae91f49 Compare November 28, 2024 15:07
    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    RPS Stats: Max: 1187.00, Min: 856.00, Avg: 1017.41, StdDev: 55.11
    TPS Stats: Max: 1169.00, Min: 783.00, Avg: 985.68, StdDev: 78.30
    Plot: View Plot

    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-1218196611

    Git Info:

    RPS Stats: Max: 1204.00, Min: 731.00, Avg: 1017.23, StdDev: 53.77
    TPS Stats: Max: 1158.00, Min: 875.00, Avg: 986.28, StdDev: 68.03

    Plot: View Plot

    @marcospb19-cw marcospb19-cw force-pushed the enha-implement-storage-cache branch from ae91f49 to f8cd0c6 Compare November 28, 2024 18:00
    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-1846717272

    Git Info:

    RPS Stats: Max: 1186.00, Min: 691.00, Avg: 986.88, StdDev: 53.79
    TPS Stats: Max: 1142.00, Min: 812.00, Avg: 956.96, StdDev: 66.99

    Plot: View Plot

    @marcospb19-cw marcospb19-cw force-pushed the enha-implement-storage-cache branch 4 times, most recently from 89c140f to 3fb250d Compare November 29, 2024 13:14
    @marcospb19-cw marcospb19-cw marked this pull request as ready for review November 29, 2024 17:20
    Copy link

    Persistent review updated to latest commit 20189b2

    @marcospb19-cw
    Copy link
    Contributor Author

    /benchmark

    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-2222931823

    Git Info:

    Configuration:

    • Target Account Strategy: Default (Random)

    RPS Stats: Max: 1306.00, Min: 766.00, Avg: 1144.55, StdDev: 70.56
    TPS Stats: Max: 1290.00, Min: 977.00, Avg: 1110.22, StdDev: 75.20

    Plot: View Plot

    @marcospb19-cw
    Copy link
    Contributor Author

    /benchmark --target-account-strategy=partial0.5

    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-3886051290

    Git Info:

    Configuration:

    • Target Account Strategy: Override (Partial(0.5))

    RPS Stats: Max: 1407.00, Min: 1012.00, Avg: 1249.07, StdDev: 58.61
    TPS Stats: Max: 1408.00, Min: 1031.00, Avg: 1211.08, StdDev: 70.40

    Plot: View Plot

    @marcospb19-cw
    Copy link
    Contributor Author

    /benchmark --target-account-strategy=partial0.5

    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-1817561765

    Git Info:

    Configuration:

    • Target Account Strategy: Override (Partial(0.5))

    RPS Stats: Max: 1369.00, Min: 941.00, Avg: 1252.77, StdDev: 54.82
    TPS Stats: Max: 1370.00, Min: 1049.00, Avg: 1214.84, StdDev: 65.89

    Plot: View Plot

    @marcospb19-cw marcospb19-cw force-pushed the enha-implement-storage-cache branch 2 times, most recently from def4b07 to 2b950ea Compare November 29, 2024 21:16
    @marcospb19-cw
    Copy link
    Contributor Author

    /benchmark --target-account-strategy=partial0.5

    which replaces the previous temporary storage linear search with
    an optimized LRU cache search, this improves TPS and importer-offline
    speed by around 10%
    @marcospb19-cw marcospb19-cw force-pushed the enha-implement-storage-cache branch from 2b950ea to 697b357 Compare December 2, 2024 17:12
    @gabriel-aranha-cw
    Copy link
    Contributor

    Benchmark:
    Run ID: bench-639153299

    Git Info:

    Configuration:

    • Target Account Strategy: Override (Partial(0.5))

    RPS Stats: Max: 1343.00, Min: 728.00, Avg: 1224.79, StdDev: 58.63
    TPS Stats: Max: 1343.00, Min: 1003.00, Avg: 1188.41, StdDev: 64.57

    Plot: View Plot

    @marcospb19-cw marcospb19-cw merged commit 3873279 into main Dec 2, 2024
    34 checks passed
    @marcospb19-cw marcospb19-cw deleted the enha-implement-storage-cache branch December 2, 2024 17:28
    @gabriel-aranha-cw
    Copy link
    Contributor

    Final benchmark:
    Run ID: bench-220247788

    Git Info:

    Configuration:

    • Target Account Strategy: Default (Random)

    RPS Stats: Max: 1323.00, Min: 750.00, Avg: 1119.25, StdDev: 71.85
    TPS Stats: Max: 1299.00, Min: 932.00, Avg: 1085.41, StdDev: 78.60

    Plot: View Plot

    @marcospb19-cw
    Copy link
    Contributor Author

    /benchmark --target-account-strategy=partial0.5

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants