Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: postgres temporary storage #769

Closed
wants to merge 8 commits into from
Closed

Conversation

carneiro-cw
Copy link
Contributor

@carneiro-cw carneiro-cw commented May 3, 2024

WIP

Copy link

github-actions bot commented May 3, 2024

PR Review

⏱️ Estimated effort to review [1-5]

4, due to the extensive changes across multiple files involving complex logic related to database operations, storage configurations, and error handling mechanisms. The PR introduces a new temporary storage system for PostgreSQL, modifies existing storage configurations, and includes detailed SQL queries which require careful review to ensure correctness and efficiency.

🧪 Relevant tests

No

🔍 Possible issues

Possible Bug: The use of .unwrap_or_else in PostgresPermanentStorage::increment_block_number might lead to panics in production if the SQL query fails. It's generally safer to handle the error explicitly.

Performance Concern: The method PostgresTemporaryStorage::save_account_changes performs a complex SQL operation with multiple nested queries. This could be inefficient, especially with large volumes of data. Consider optimizing the SQL queries or handling some logic in the application layer to reduce database load.

🔒 Security concerns

No

Code feedback:
relevant filesrc/config.rs
suggestion      

Consider adding error handling for the clone operation on temp_storage_kind to prevent potential panics if the enum cannot be cloned due to future changes in its structure. [important]

relevant linematch self.temp_storage_kind.clone() {

relevant filesrc/eth/storage/postgres/postgres_temporary.rs
suggestion      

Implement logging for successful database connections and detailed error messages for connection failures to aid in debugging and monitoring the storage system. [important]

relevant lineErr(e) => return log_and_err!(reason = e, "failed to start postgres permanent storage"),

relevant filesrc/eth/storage/postgres/postgres_temporary.rs
suggestion      

Optimize the SQL query in save_account_changes by reducing the complexity of the nested transactions and using more efficient SQL constructs. This could improve performance and maintainability. [important]

relevant linesqlx::query!(

relevant filesrc/eth/storage/postgres/postgres_temporary.rs
suggestion      

Add transaction rollback mechanisms in the reset method to ensure that all operations are atomic and can be reverted in case of errors, enhancing the robustness of the temporary storage reset functionality. [important]

relevant linetokio::try_join!(truncate_future, block_number_future)?;


✨ Review tool usage guide:

Overview:
The review tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be added by configuring the tool.

The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.

  • When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:
/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...
[pr_reviewer]
some_config1=...
some_config2=...

See the review usage page for a comprehensive guide on using this tool.

Copy link

github-actions bot commented May 3, 2024

PR Code Suggestions

CategorySuggestions                                                                                                                                                       
Enhancement
Improve error messaging for unknown temporary storage types.

Consider using a more specific error message when handling the Postgres variant in the
FromStr implementation for TemporaryStorageKind. This will help in debugging and
understanding the error context better.

src/config.rs [613-614]

 s if s.starts_with("postgres://") => Ok(Self::Postgres { url: s.to_string() }),
-s => Err(anyhow!("unknown temporary storage: {}", s)),
+s => Err(anyhow!("unknown temporary storage for '{}', expected 'inmemory', 'rocks', or a 'postgres://' URL", s)),
 
Improve error handling for missing fields when converting ExecutionAccountChanges to Account.

To ensure that all fields are properly handled, add error handling for cases where take()
might return None for fields in ExecutionAccountChanges when converting to Account.

src/eth/primitives/execution_account_changes.rs [125-127]

-nonce: value.nonce.take().context("No nonce in changes")?,
-balance: value.balance.take().context("No balance in changes")?,
-bytecode: value.bytecode.take().context("No bytecode in changes")?,
+nonce: value.nonce.take().ok_or_else(|| anyhow!("Nonce is required but not present in changes"))?,
+balance: value.balance.take().ok_or_else(|| anyhow!("Balance is required but not present in changes"))?,
+bytecode: value.bytecode.take().ok_or_else(|| anyhow!("Bytecode is required but not present in changes"))?,
 
Improve error handling by replacing unwrap_or_else with proper error propagation.

Consider handling potential errors from fetch_one method instead of using unwrap_or_else
to provide a more robust error handling mechanism. This will prevent the application from
panicking and allow it to handle errors more gracefully.

src/eth/storage/postgres/postgres_permanent.rs [116-117]

 .fetch_one(&self.pool)
-.await
-.unwrap_or_else(|err| {
+.await?
 
Remove unnecessary cloning of the pool object.

Replace the clone() call with a direct use of the pool variable since PgPool already
implements Clone internally and it's cheap to clone because it's an Arc under the hood.

src/eth/storage/postgres/postgres_temporary.rs [42]

-Ok(pool) => pool.clone(),
+Ok(pool) => pool,
 
Improve error handling for type conversion to provide more detailed error information.

Consider handling the potential error from the try_into conversion explicitly instead of
using ?, which could provide more detailed error information.

src/eth/storage/postgres/postgres_temporary.rs [67-68]

-.map(TryInto::try_into)
+.map(|val| val.try_into().map_err(|e| anyhow::anyhow!("Failed conversion: {}", e)))
 .transpose()
 
Wrap the database operations in a transaction to ensure atomicity.

Use a transaction to ensure that all changes are committed atomically. This prevents
partial updates which can lead to inconsistent states in case of errors.

src/eth/storage/postgres/postgres_temporary.rs [231-233]

+let transaction = self.pool.begin().await?;
 sqlx::query!(...)
-.execute(&self.pool)
+.execute(&transaction)
 .await?;
+transaction.commit().await?;
 
Maintainability
Refactor the creation of PostgresTemporaryStorage into a separate function.

For better error handling and future maintenance, consider separating the creation of
PostgresTemporaryStorage into a separate asynchronous function. This will make the init
function cleaner and more modular.

src/config.rs [593-600]

-TemporaryStorageKind::Postgres { url } => Ok(Arc::new(
-    PostgresTemporaryStorage::new(PostgresTemporaryStorageConfig {
-        url,
-        connections: self.temp_storage_connections,
-        acquire_timeout: Duration::from_millis(self.temp_storage_timeout_millis),
-    })
-    .await?,
-)),
+TemporaryStorageKind::Postgres { url } => self.create_postgres_temporary_storage(url).await,
 
Replace hard-coded SQL paths with configurable paths for better maintainability.

Replace the hard-coded SQL file paths with a configuration setting or constant to make the
codebase more maintainable and flexible to changes in the directory structure.

src/eth/storage/postgres/postgres_permanent.rs [114]

-sqlx::query_file_scalar!("src/eth/storage/postgres/sql/permanent/select_current_block_number.sql")
+sqlx::query_file_scalar!(config::SQL_PATHS.current_block_number)
 
Refactor large SQL block into smaller, more manageable functions.

Refactor the large SQL query block to use smaller, well-named functions or separate it
into multiple queries to improve maintainability and readability.

src/eth/storage/postgres/postgres_temporary.rs [231-233]

-sqlx::query!(...)
-.execute(&self.pool)
-.await?;
+self.update_slots(&slots_batch)?;
+self.update_accounts(&account_batch)?;
 
Bug
Handle potential overflow when converting milliseconds to Duration.

It's recommended to handle potential overflow when converting temp_storage_timeout_millis
to Duration. Use Duration::from_millis safely to avoid panic in case of overflow.

src/config.rs [597]

-acquire_timeout: Duration::from_millis(self.temp_storage_timeout_millis),
+acquire_timeout: Duration::from_millis(self.temp_storage_timeout_millis.checked_add(1).unwrap_or(u64::MAX)),
 
Correct the fields compared in the WHERE clause of the ON CONFLICT section.

The WHERE clause in the ON CONFLICT section uses excluded.previous_balance for both
latest_nonce and latest_balance comparisons. This seems incorrect as it should likely
compare latest_nonce with excluded.previous_nonce and latest_balance with
excluded.previous_balance.

src/eth/storage/postgres/sql/permanent/insert_account.sql [29]

-WHERE accounts.latest_nonce = excluded.previous_balance AND accounts.latest_balance = excluded.previous_balance
+WHERE accounts.latest_nonce = excluded.previous_nonce AND accounts.latest_balance = excluded.previous_balance
 
Best practice
Rename modules to adhere to Rust's naming conventions.

Consider renaming postgres_permanent and postgres_temporary modules to follow Rust's
naming conventions, which prefer snake_case for file names.

src/eth/storage/postgres/mod.rs [2-4]

-mod postgres_permanent;
-mod postgres_temporary;
+mod postgres_permanent_storage;
+mod postgres_temporary_storage;
 
Use fetch_optional for queries that might return no results.

Use fetch_optional instead of fetch_one for queries that might not return a result to
explicitly handle the possibility of no results found, which is more idiomatic in Rust for
optional data.

src/eth/storage/postgres/postgres_permanent.rs [116]

-.fetch_one(&self.pool)
+.fetch_optional(&self.pool)
 
Use pattern matching to handle optional query results more safely.

Utilize Rust's type system to handle potential None values from fetch_optional more safely
by using pattern matching, which can provide more control over the flow of data and error
handling.

src/eth/storage/postgres/postgres_permanent.rs [356]

-.fetch_optional(&self.pool);
+match .fetch_optional(&self.pool).await {
+    Ok(Some(data)) => data,
+    Ok(None) => return Err(anyhow::Error::msg("No data found")),
+    Err(e) => return Err(e.into()),
+}
 
Improve handling of missing original values to avoid silent defaults.

Instead of using unwrap_or_default() which might hide potential issues, handle the absence
of original values explicitly or log a warning.

src/eth/storage/postgres/postgres_temporary.rs [125]

-let original_value = original.unwrap_or_default().value;
+let original_value = original.map_or_else(|| {
+    tracing::warn!("Original value not found, using default.");
+    Default::default()
+}, |orig| orig.value);
 
Performance
Optimize SQL queries by batching or combining them where possible.

For better performance and reduced latency, consider batching similar SQL queries or using
a single compound SQL query instead of multiple separate queries in scenarios where
multiple related data items are fetched sequentially.

src/eth/storage/postgres/postgres_permanent.rs [353-363]

-let header_query = sqlx::query_file_as!(
-    BlockHeader,
-    "src/eth/storage/postgres/sql/permanent/select_block_header_by_number.sql",
+let (header, transactions) = sqlx::query_file_as!(
+    (BlockHeader, Vec<PostgresTransaction>),
+    "src/eth/storage/postgres/sql/permanent/select_header_and_transactions_by_block_number.sql",
     block_number as _
 )
-.fetch_optional(&self.pool);
-let transactions_query = sqlx::query_file_as!(
-    PostgresTransaction,
-    "src/eth/storage/postgres/sql/permanent/select_transactions_by_block_number.sql",
-    block_number as _
-)
-.fetch_all(&self.pool);
+.fetch_one(&self.pool).await?;
 

✨ Improve tool usage guide:

Overview:
The improve tool scans the PR code changes, and automatically generates suggestions for improving the PR code. The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on a PR.

  • When commenting, to edit configurations related to the improve tool (pr_code_suggestions section), use the following template:
/improve --pr_code_suggestions.some_config1=... --pr_code_suggestions.some_config2=...
[pr_code_suggestions]
some_config1=...
some_config2=...

See the improve usage page for a comprehensive guide on using this tool.

@carneiro-cw carneiro-cw deleted the postgres_temporary branch May 24, 2024 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants