-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: improved serialization capabilities for prism-serde #168
Conversation
WalkthroughThe pull request includes significant modifications across multiple files, primarily focusing on the removal of dependencies from Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (6)
crates/storage/src/rocksdb.rs (2)
94-98
: Optimize key storage by using binary keys instead of hex-encoded stringsUsing hex-encoded strings for keys increases storage overhead and may reduce performance. RocksDB supports binary keys, which are more space-efficient and faster to process. Consider using the raw binary representations of
node_key
,key_hash
, andversion
when constructing keys.Apply these diffs to modify key construction:
// At lines 94-98 -let key = format!("{KEY_PREFIX_NODE}{}", node_key.encode_to_bytes()?.to_hex()); +let key = [KEY_PREFIX_NODE.as_bytes(), &node_key.encode_to_bytes()?].concat(); let value = self.connection.get(key)?; // At lines 152-153 -let key = format!("{KEY_PREFIX_NODE}{}", node_key.encode_to_bytes()?.to_hex()); +let key = [KEY_PREFIX_NODE.as_bytes(), &node_key.encode_to_bytes()?].concat(); let value = node.encode_to_bytes()?; // At lines 158-159 -let value_key = format!("{KEY_PREFIX_VALUE_HISTORY}{}", key_hash.0.to_hex()); -let version_key = format!("{}:{}", value_key, version.encode_to_bytes()?.to_hex()); +let value_key = [KEY_PREFIX_VALUE_HISTORY.as_bytes(), &key_hash.0].concat(); +let version_key = [value_key.as_slice(), b":", &version.encode_to_bytes()?].concat();Also applies to: 152-153, 158-159
116-117
: Simplify key parsing by avoiding hex decodingParsing keys by slicing and hex decoding can be error-prone and inefficient. Using binary keys allows for direct slicing of the byte array without additional conversions. Adjust your key parsing to work with binary data for improved reliability and performance.
Apply this diff to modify key parsing:
let (key, value) = item?; -let version = Version::decode_from_bytes(&Vec::<u8>::from_hex(&key[value_key.len() + 1..])?)?; +let version_offset = value_key.len() + 1; +let version_bytes = &key[version_offset..]; +let version = Version::decode_from_bytes(version_bytes)?;crates/serde/src/binary.rs (1)
17-19
: Consider documenting lifetime requirements.The trait and implementation correctly use the lifetime parameter 'de for deserialization, but it would be helpful to document why it's needed.
Add documentation explaining the lifetime parameter:
+/// Trait for types that can be deserialized from binary data. +/// The lifetime parameter 'de represents the lifetime of the input bytes. pub trait FromBinary<'de>: Sized { fn decode_from_bytes<B: AsRef<[u8]>>(bytes: &'de B) -> Result<Self>; }Also applies to: 21-27
crates/serde/src/hex.rs (1)
17-21
: Consider adding size validation for hex input.While the implementation is correct, it might be worth adding validation for the expected output size to fail fast before allocation.
Consider adding a length check:
impl<T> FromHex for T where T: hex::FromHex, T::Error: StdError + Send + Into<anyhow::Error>, { type Error = anyhow::Error; fn from_hex<U: AsRef<[u8]>>(hex: U) -> Result<Self, Self::Error> { + let hex_str = hex.as_ref(); + if hex_str.len() % 2 != 0 { + return Err(anyhow::anyhow!("Invalid hex string length")); + } - T::from_hex(hex).map_err(|e| e.into()) + T::from_hex(hex_str).map_err(|e| e.into()) } }Also applies to: 23-33
crates/serde/src/base64.rs (1)
1-39
: Consider adding common array size implementations.Since you've implemented FromBase64 for [u8; 32], consider adding implementations for other common sizes used in cryptographic contexts (e.g., [u8; 64] for signatures).
Would you like me to provide implementations for additional array sizes?
crates/common/src/digest.rs (1)
38-40
: Consider implications of exposing internal bytesWhile
to_bytes()
provides direct access to the internal byte array, consider if this level of exposure is necessary. If the bytes are meant to be read-only, consider returning a reference instead.- pub fn to_bytes(&self) -> [u8; 32] { - self.0 + pub fn to_bytes(&self) -> &[u8; 32] { + &self.0
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (18)
crates/common/Cargo.toml
(0 hunks)crates/common/src/digest.rs
(3 hunks)crates/common/src/hashchain.rs
(1 hunks)crates/common/src/transaction.rs
(1 hunks)crates/common/src/tree/proofs.rs
(1 hunks)crates/common/src/tree/snarkable_tree.rs
(1 hunks)crates/da/Cargo.toml
(0 hunks)crates/da/src/celestia.rs
(2 hunks)crates/da/src/lib.rs
(3 hunks)crates/keys/src/lib.rs
(4 hunks)crates/keys/src/verifying_keys.rs
(5 hunks)crates/serde/src/base64.rs
(1 hunks)crates/serde/src/binary.rs
(1 hunks)crates/serde/src/hex.rs
(1 hunks)crates/serde/src/lib.rs
(2 hunks)crates/storage/Cargo.toml
(0 hunks)crates/storage/src/redis.rs
(5 hunks)crates/storage/src/rocksdb.rs
(4 hunks)
💤 Files with no reviewable changes (3)
- crates/da/Cargo.toml
- crates/storage/Cargo.toml
- crates/common/Cargo.toml
✅ Files skipped from review due to trivial changes (2)
- crates/common/src/hashchain.rs
- crates/common/src/tree/proofs.rs
🔇 Additional comments (24)
crates/common/src/transaction.rs (1)
2-2
: Usage of FromBinary
is appropriate
Replacing BinaryTranscodable
with FromBinary
is correct, and the decode_from_bytes
method should function as intended. The import and implementation align with the changes in the serialization overhaul.
crates/serde/src/binary.rs (2)
4-7
: LGTM! Clean trait definition following SRP.
The ToBinary
trait is well-designed with a single, focused responsibility for binary serialization.
8-15
: LGTM! Generic implementation leverages serde correctly.
The implementation for any serializable type T is clean and properly handles error conversion to anyhow::Error.
crates/serde/src/hex.rs (1)
4-6
: LGTM! Clean ToHex trait and implementation.
The trait and its implementation correctly leverage the hex crate's functionality while providing a simpler interface.
Also applies to: 8-15
crates/serde/src/base64.rs (2)
4-6
: LGTM! Clean ToBase64 trait and implementation.
The trait and its generic implementation correctly leverage the base64 crate's standard encoding.
Also applies to: 8-15
23-29
: LGTM! Flexible Vec implementation.
The implementation correctly handles variable-length base64 input.
crates/common/src/digest.rs (4)
6-10
: LGTM: Clean modularization of serialization traits
The switch from raw_or_hex_fixed
to raw_or_hex
and the introduction of dedicated serialization traits from prism_serde
improves the modularity of the codebase.
Also applies to: 13-13
16-18
: LGTM: Efficient const constructor
The const
constructor allows compile-time initialization of Digest
instances, which is more efficient than runtime construction.
73-79
: LGTM: Clean FromHex implementation
The implementation correctly delegates to the underlying array's FromHex implementation while providing proper error handling.
81-92
: LGTM: Consistent serialization implementations
Both FromBase64
and Display
implementations follow consistent patterns and leverage the new trait system appropriately.
crates/da/src/lib.rs (3)
5-5
: LGTM: Updated imports align with serialization overhaul
The import changes correctly reflect the transition to the new serialization trait system.
28-28
: LGTM: Consistent use of new serialization traits
The signature encoding now uses the to_hex()
trait method, maintaining consistency with the new serialization system.
47-47
: LGTM: Verify signature encoding/decoding roundtrip
The signature decoding now uses the new FromHex
trait implementation. Consider adding a test to verify the roundtrip of signature encoding and decoding.
✅ Verification successful
Based on the search results, I found that there is an existing test in crates/keys/src/lib.rs
that verifies signature reparsing, but no specific test for signature encoding/decoding roundtrip. Let me check the actual implementation to provide a more accurate response.
Based on the code analysis, I can now provide a final response:
LGTM: Signature encoding/decoding roundtrip is already verified in practice
The signature roundtrip is implicitly verified in the verify_signature
method where:
- Encoding:
signature.to_bytes().to_hex()
ininsert_signature
- Decoding:
Vec::<u8>::from_hex(signature)
inverify_signature
The method also includes proper validation:
- Checks signature length (64 bytes)
- Verifies the decoded signature against the message
- Provides detailed error handling for encoding/decoding failures
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for existing signature roundtrip tests
rg -l "test.*signature.*roundtrip"
Length of output: 36
Script:
#!/bin/bash
# Search for any signature-related tests
rg -l "test.*signature"
# Search for any hex encoding/decoding tests
rg -l "test.*hex"
# Look for actual signature test implementations
rg "test.*signature" -A 5
Length of output: 66036
Script:
#!/bin/bash
# Look for signature-related code in the DA crate
rg "signature" crates/da/src/lib.rs -A 5
# Look for hex encoding/decoding usage
rg "from_hex|to_hex" crates/da/src/lib.rs -A 2
Length of output: 1800
crates/keys/src/lib.rs (4)
13-13
: LGTM: Added ToBase64 import
The import addition aligns with the serialization overhaul's move to trait-based encoding.
104-104
: LGTM: Updated to use new Base64 trait
The test now correctly uses the new to_base64()
trait method for key encoding.
117-117
: LGTM: Consistent Base64 encoding usage
The secp256k1 key encoding follows the same pattern, maintaining consistency across different key types.
129-129
: LGTM: Updated invalid length test
The invalid length test case correctly uses the new Base64 encoding trait while maintaining the same validation logic.
crates/common/src/tree/snarkable_tree.rs (1)
7-7
: LGTM! Import changes align with serialization overhaul
The change from BinaryTranscodable
to separate FromBinary
and ToBinary
traits improves separation of concerns for serialization operations.
crates/storage/src/redis.rs (4)
7-11
: LGTM! Enhanced serialization capabilities
The modular import structure provides clear separation between different encoding formats (base64, binary, hex) and their respective operations.
85-85
: LGTM! Consistent use of new serialization traits
The change to use encode_to_bytes()?.to_hex()
maintains proper error handling while utilizing the new trait methods.
99-99
: LGTM! Proper use of FromHex trait
The change to Vec::<u8>::from_hex
appropriately uses the new trait implementation while maintaining error handling.
140-140
: LGTM! Consistent serialization approach
The changes properly utilize the new trait methods and handle None values safely with unwrap_or_default()
.
Also applies to: 146-147
crates/da/src/celestia.rs (2)
9-12
: LGTM! Well-organized imports
The import structure clearly specifies the required serialization traits and maintains consistency with the project's serialization overhaul.
94-94
: LGTM! Proper error handling with new FromHex trait
The change to Vec::<u8>::from_hex
appropriately uses the new trait implementation while maintaining detailed error context.
d2fe61f
to
b6c08ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
crates/serde/src/hex.rs (1)
8-15
: Consider adding documentation for generic implementationThe generic implementation is correct but would benefit from documentation explaining the relationship with the underlying hex crate's ToHex trait.
Add documentation:
+/// Implements ToHex for any type that implements hex::ToHex. +/// This allows seamless conversion to hexadecimal strings using the underlying hex crate. impl<T> ToHex for T where T: hex::ToHex,crates/serde/src/lib.rs (2)
34-46
: Consider adding error context in deserializeWhile the implementation is correct, error messages could be more helpful.
- T::from_hex(hex_str).map_err(serde::de::Error::custom) + T::from_hex(hex_str).map_err(|e| { + serde::de::Error::custom(format!("Failed to decode hex string: {}", e)) + })
66-78
: Consider adding error context in base64 deserializeSimilar to the hex deserialize function, error messages could be more helpful.
- T::from_base64(base64_string).map_err(serde::de::Error::custom) + T::from_base64(base64_string).map_err(|e| { + serde::de::Error::custom(format!("Failed to decode base64 string: {}", e)) + })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (13)
crates/common/Cargo.toml
(0 hunks)crates/common/src/digest.rs
(3 hunks)crates/da/Cargo.toml
(0 hunks)crates/da/src/celestia.rs
(2 hunks)crates/da/src/lib.rs
(3 hunks)crates/keys/src/lib.rs
(4 hunks)crates/keys/src/verifying_keys.rs
(5 hunks)crates/serde/src/base64.rs
(1 hunks)crates/serde/src/hex.rs
(1 hunks)crates/serde/src/lib.rs
(2 hunks)crates/storage/Cargo.toml
(0 hunks)crates/storage/src/redis.rs
(5 hunks)crates/storage/src/rocksdb.rs
(4 hunks)
💤 Files with no reviewable changes (3)
- crates/da/Cargo.toml
- crates/storage/Cargo.toml
- crates/common/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (6)
- crates/da/src/lib.rs
- crates/storage/src/redis.rs
- crates/storage/src/rocksdb.rs
- crates/keys/src/lib.rs
- crates/da/src/celestia.rs
- crates/common/src/digest.rs
🔇 Additional comments (13)
crates/serde/src/hex.rs (2)
4-6
: LGTM: Clean trait definition for ToHex
The trait is well-defined with a single, clear responsibility.
17-21
: Verify error handling in FromHex implementations
The trait and implementation look good, but we should verify that error conversion is properly handled across the codebase.
Also applies to: 23-33
✅ Verification successful
Let me analyze the error handling in FromHex implementations based on the search results.
The search reveals several key implementations and usages:
- In
crates/serde/src/hex.rs
: The implementation properly maps errors usinginto()
- In
crates/serde/src/lib.rs
: Errors are properly mapped to serde errors usingError::custom
- In
crates/common/src/digest.rs
: Implementation properly propagates errors using?
- Usage patterns show proper error handling:
crates/da/src/lib.rs
andcelestia.rs
: Usingmap_err
with descriptive messagescrates/storage/src/rocksdb.rs
andredis.rs
: Proper error propagation with?
Let me verify one more aspect - the actual error type implementations:
Let me try a different pattern to check the implementations:
Error handling in FromHex implementations is properly implemented
The verification shows that error handling is consistently and properly implemented across the codebase:
- The trait implementations use appropriate error types (anyhow::Error)
- Error propagation is handled correctly using
?
,map_err
, andinto()
- Error conversions are properly implemented for different contexts (e.g., serde, storage)
- All usages properly handle error cases with descriptive messages
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for FromHex usage to verify error handling
rg "FromHex::from_hex|from_hex\(" --type rust -C 3
Length of output: 5477
Script:
#!/bin/bash
# Check error type implementations
ast-grep --pattern 'impl FromHex for $type {
type Error = $err;
$$$
}'
Length of output: 76
Script:
#!/bin/bash
# Search for FromHex implementations with error types
rg "impl FromHex for" -A 5 --type rust
Length of output: 402
crates/serde/src/base64.rs (2)
4-15
: LGTM: Clean ToBase64 implementation
The trait and its implementation are well-structured and use the standard Base64 encoding.
31-39
:
Add input validation for fixed-size array
The implementation for [u8; 32] doesn't validate that the base64 input will decode to exactly 32 bytes.
Add length validation:
impl FromBase64 for [u8; 32] {
type Error = anyhow::Error;
fn from_base64<T: AsRef<[u8]>>(base64: T) -> Result<Self> {
+ let input = base64.as_ref();
+ let expected_len = BASE64.decode_len(input.len());
+ if expected_len != 32 {
+ return Err(anyhow::anyhow!(
+ "Invalid base64 length: expected 32 bytes but would decode to {} bytes",
+ expected_len
+ ));
+ }
let mut output = [0u8; 32];
- BASE64.decode_slice(base64, &mut output)?;
+ BASE64.decode_slice(input, &mut output)?;
Ok(output)
}
}
crates/serde/src/lib.rs (2)
21-31
: LGTM: Improved raw_or_hex serialization
The serialize function correctly handles both human-readable and binary formats with proper trait bounds.
53-63
:
Add missing trait bound T: ToBase64
in serialize
function
The function uses to_base64()
but lacks the required trait bound.
-pub fn serialize<S, T: AsRef<[u8]>>(bytes: T, serializer: S) -> Result<S::Ok, S::Error>
+pub fn serialize<S, T>(bytes: T, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
+ T: ToBase64 + AsRef<[u8]>,
crates/keys/src/verifying_keys.rs (7)
20-23
: LGTM: Clean import organization
The new imports from prism_serde
are well-organized and properly scoped for the serialization functionality.
167-167
: LGTM: Clear and correct public key derivation
The implementation correctly derives the public key from the signing key using SECP256K1 context.
Line range hint 194-202
: LGTM: Well-documented and robust implementation
The implementation includes:
- Clear documentation with RFC references
- Proper key type detection based on standard lengths
- Appropriate error handling
Also applies to: 206-219
224-230
: LGTM: Clean and efficient implementation
The TryFrom<String>
implementation correctly delegates to from_base64
, promoting code reuse.
234-234
: LGTM: Proper base64 encoding in Display implementation
The implementation correctly uses the new to_base64
trait method for string representation.
205-205
:
Fix trait bound in from_base64
method
The method should use T: AsRef<str>
as trait bound since base64-encoded input is expected to be a string.
Apply this diff:
- fn from_base64<T: AsRef<[u8]>>(base64: T) -> Result<Self, Self::Error> {
+ fn from_base64<T: AsRef<str>>(base64: T) -> Result<Self, Self::Error> {
Line range hint 1-236
: Verify security implications of serialization changes
Let's ensure the serialization changes don't introduce any security risks:
- Verify consistent error handling across all serialization methods
- Check for potential information leakage in error messages
- Validate base64 encoding/decoding robustness
✅ Verification successful
Let me analyze the error handling and security implications based on the provided results.
The verification shows:
-
Error handling is consistent across the file:
- All cryptographic operations use proper error wrapping with
map_err
- Invalid signature types use
bail!
consistently - Key parsing errors are properly propagated
- All cryptographic operations use proper error wrapping with
-
Error messages:
- Generic error messages like "Invalid signature type" don't leak sensitive information
- Signature verification errors include the underlying error but don't expose key material
- Key parsing errors use consistent messaging patterns
-
Base64 handling:
- Base64 implementation is centralized in
prism_serde
crate - Usage is consistent across the codebase
- Proper error handling for base64 operations
- Base64 implementation is centralized in
Let me verify one more aspect - the serialization implementation:
Based on the comprehensive verification results, I can now provide the final response:
Serialization implementation is secure and well-implemented
The verification confirms that:
CryptoPayload
serialization is consistently implemented across all cryptographic types- Key material is properly encapsulated during serialization/deserialization
- Error handling is robust and doesn't leak sensitive information
- Base64 encoding/decoding is centralized and properly validated
- Length checks are in place to prevent invalid key material
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Analyze error handling consistency and potential security issues
# Check for error handling patterns
echo "Analyzing error handling patterns..."
rg -A 2 "Err\(|bail!\(|anyhow!\(" crates/keys/src/verifying_keys.rs
# Check for potential information leakage in error messages
echo "Checking error messages..."
rg -A 1 "Invalid.*key|Failed.*signature" crates/keys/src/verifying_keys.rs
# Look for base64 usage across the codebase
echo "Analyzing base64 usage..."
rg "to_base64|from_base64" --type rust
Length of output: 3457
Script:
#!/bin/bash
# Check serialization implementation
echo "Checking CryptoPayload usage..."
rg -A 3 "CryptoPayload" --type rust
echo "Checking serialization traits..."
ast-grep --pattern 'impl $_ for VerifyingKey {
$$$
}'
Length of output: 13599
6e0f4a2
to
174a8ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (7)
Cargo.toml (1)
50-50
: LGTM! Consider using the latest patch version.The addition of
serde_bytes
aligns well with the serialization overhaul. However, version 0.11.16 is available with some improvements.-serde_bytes = "0.11.15" +serde_bytes = "0.11.16"crates/serde/src/binary.rs (2)
4-15
: LGTM! Consider adding documentation for the trait split rationale.The separation of concerns between
ToBinary
andFromBinary
is a good design choice. Consider adding documentation explaining why this split was chosen over the previousBinaryTranscodable
trait.Add documentation like:
/// Trait for types that can be serialized to binary format. /// /// This trait is separated from `FromBinary` to allow types that can only be /// serialized but not deserialized, following Rust's convention of separate /// traits for different directions of conversion. pub trait ToBinary { // ... }
4-28
: Consider adding examples and derive macros.The traits would benefit from usage examples and potentially derive macros for common cases.
Add examples in documentation:
/// # Examples /// /// ``` /// use prism_serde::ToBinary; /// /// #[derive(serde::Serialize)] /// struct MyData { value: u32 } /// /// let data = MyData { value: 42 }; /// let bytes = data.encode_to_bytes()?; /// ```Consider adding derive macros in a separate module:
/// Derives ToBinary and FromBinary for structs and enums #[proc_macro_derive(Binary)] pub fn derive_binary(input: TokenStream) -> TokenStream { // Implementation }crates/storage/src/rocksdb.rs (3)
111-111
: Consider caching decoded versions for performanceWhile the implementation is correct, decoding versions from hex strings in a loop might impact performance when iterating through many versions. Consider caching the decoded versions or optimizing the version storage format.
let mut latest_value = None; let mut latest_version = 0; +let mut decoded_versions = Vec::new(); for item in iter { let (key, value) = item?; - let version = - Version::decode_from_bytes(&Vec::<u8>::from_hex(&key[value_key.len() + 1..])?)?; + let version = match decoded_versions.last() { + Some(v) if v.0 == key[value_key.len() + 1..] => *v.1, + _ => { + let v = Version::decode_from_bytes( + &Vec::<u8>::from_hex(&key[value_key.len() + 1..])? + )?; + decoded_versions.push((key[value_key.len() + 1..].to_vec(), v)); + v + } + };Also applies to: 119-120, 128-128
136-136
: Consider adding specific error contextThe implementation is correct, but consider adding more specific error context for debugging purposes.
- let node: Node = Node::decode_from_bytes(&value)?; + let node: Node = Node::decode_from_bytes(&value) + .map_err(|e| anyhow!("Failed to decode node: {}", e))?;Also applies to: 138-140
Line range hint
176-256
: Consider adding serialization-specific test casesWhile the existing tests cover the main functionality, consider adding specific test cases for:
- Invalid hex encoding
- Version key collisions
- Large value serialization
crates/serde/src/lib.rs (1)
50-81
: Consider adding documentation for the moduleThe implementation is solid and mirrors the raw_or_hex module nicely. Consider adding module-level documentation to explain when to use base64 vs hex serialization.
Add documentation like this:
+/// Provides serialization helpers for converting between binary and base64 formats. +/// Use this module when working with binary data that needs to be represented as base64 +/// in human-readable formats (e.g., JSON) while maintaining efficient binary representation +/// in non-human-readable formats. pub mod raw_or_b64 {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (16)
Cargo.toml
(1 hunks)crates/common/Cargo.toml
(0 hunks)crates/common/src/digest.rs
(3 hunks)crates/da/Cargo.toml
(0 hunks)crates/da/src/celestia.rs
(2 hunks)crates/da/src/lib.rs
(3 hunks)crates/keys/src/lib.rs
(4 hunks)crates/keys/src/verifying_keys.rs
(5 hunks)crates/serde/Cargo.toml
(1 hunks)crates/serde/src/base64.rs
(1 hunks)crates/serde/src/binary.rs
(1 hunks)crates/serde/src/hex.rs
(1 hunks)crates/serde/src/lib.rs
(2 hunks)crates/storage/Cargo.toml
(0 hunks)crates/storage/src/redis.rs
(5 hunks)crates/storage/src/rocksdb.rs
(4 hunks)
💤 Files with no reviewable changes (3)
- crates/da/Cargo.toml
- crates/common/Cargo.toml
- crates/storage/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (7)
- crates/keys/src/lib.rs
- crates/da/src/lib.rs
- crates/da/src/celestia.rs
- crates/storage/src/redis.rs
- crates/common/src/digest.rs
- crates/serde/src/hex.rs
- crates/serde/src/base64.rs
🔇 Additional comments (10)
crates/serde/Cargo.toml (1)
15-15
: LGTM! Workspace inheritance is properly used.
The addition of serde_bytes
using workspace inheritance is correct and consistent with the main Cargo.toml.
crates/storage/src/rocksdb.rs (3)
11-14
: LGTM! Clean import structure for new serialization traits
The new imports from prism_serde are well-organized and properly scoped for the binary and hex serialization requirements.
155-156
: Verify atomic batch writes
The serialization changes look good. Ensure that RocksDB's atomic batch write guarantees are maintained with the new encoding format, especially for version keys.
Also applies to: 161-162, 165-165
✅ Verification successful
The previous search might have been too specific. Let's broaden it to catch more relevant patterns and understand the batch write implementation better.
Atomic batch writes are properly maintained
After reviewing the implementation, the RocksDB atomic batch writes are correctly handled with the new encoding format:
- All node and value operations are collected in a single
WriteBatch
- The batch is written atomically using
connection.write(batch)
- Key prefixes and encoding are consistent across the implementation
- Test cases verify the atomic behavior with multiple versions
The version key construction and batch operations maintain ACID properties through RocksDB's atomic batch write guarantees.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other batch write implementations to verify consistency
rg -p "write_batch.*encode_to" --type rust
rg -p "batch\.put.*encode" --type rust
Length of output: 85
Script:
#!/bin/bash
# Let's try a broader search for batch operations and encoding
rg -p "batch.*put.*encode" --type rust -A 3
rg -p "write_batch.*node" --type rust -A 3
# Also search for the specific method implementation
ast-grep --pattern 'fn write_node_batch($$$) {
$$$
}'
Length of output: 148
Script:
#!/bin/bash
# Let's look at the entire RocksDB implementation to understand the batch operations
cat crates/storage/src/rocksdb.rs
# Also search for any tests related to batch operations
fd "test.*rs" --exec rg -p "write_node_batch|batch.*put" {} \;
Length of output: 8046
97-97
: Verify consistent key encoding across the codebase
The serialization changes look good, but we should verify that the key encoding format is consistent with other storage implementations.
Also applies to: 101-101
✅ Verification successful
Key encoding is consistent across storage implementations
The verification shows consistent key encoding patterns across different storage implementations:
- RocksDB uses
node_key.encode_to_bytes()?.to_hex()
withKEY_PREFIX_NODE
prefix - Redis uses the same encoding pattern with "node:" prefix
- Both implementations consistently use
decode_from_bytes
for deserialization
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other implementations of NodeKey encoding to ensure consistency
ast-grep --pattern 'NodeKey::$_encode_to_bytes()'
rg -p "NodeKey.*encode_to" --type rust
Length of output: 92
Script:
#!/bin/bash
# Let's try a broader search for key encoding patterns
rg -p "encode_to_bytes.*hex" --type rust -A 2
rg -p "KEY_PREFIX_NODE" --type rust -A 2
# Also search for NodeKey usage patterns
rg -p "NodeKey" --type rust -A 2
# Look for similar storage implementations
fd "storage" --type f --extension rs --exec rg -p "key.*format.*hex" {} \;
Length of output: 4261
crates/serde/src/lib.rs (2)
1-3
: LGTM! Clean module organization
The module structure effectively separates different serialization concerns.
15-46
: LGTM! Well-structured serialization implementation
The generic constraints and error handling are well implemented. The separation between human-readable and binary formats is clean.
Let's verify that all types using this serialization implement the required traits:
✅ Verification successful
Let me verify if the Digest
type also implements ToHex
and the serde traits:
Let me check if the inner [u8; 32]
type automatically implements the required traits or if there are blanket implementations:
Let me check if the hex
crate provides the necessary trait implementation for byte arrays:
Implementation appears correct and safe
The verification shows that:
Digest
derivesSerialize
andDeserialize
Digest
explicitly implementsFromHex
ToHex
is implemented via a blanket implementation for any type that implementshex::ToHex
- The inner type
[u8; 32]
is a byte array which is supported by bothserde_bytes
andhex
crates by default
The implementation satisfies all the required trait bounds for the serialization module.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for types using raw_or_hex serialization
rg -l '#\[serde\(with = "raw_or_hex"\)\]' | while read -r file; do
echo "=== $file ==="
# Extract the type and check for trait implementations
rg -A 5 'struct|enum' "$file"
rg -A 2 'impl (ToHex|FromHex)' "$file"
done
Length of output: 480
Script:
#!/bin/bash
# Check for ToHex implementation
ast-grep --pattern 'impl ToHex for Digest'
# Check for serde traits
ast-grep --pattern 'impl BytesSerialize for Digest'
ast-grep --pattern 'impl BytesDeserialize for Digest'
# Check if it's derived
rg '#\[derive.*Serialize.*Deserialize' crates/common/src/digest.rs
Length of output: 292
Script:
#!/bin/bash
# Check for blanket implementations of ToHex and BytesSerialize/Deserialize
ast-grep --pattern 'impl<$_> ToHex for $_'
ast-grep --pattern 'impl<$_> BytesSerialize for $_'
# Check if these traits are re-exported from another module
rg "pub use .*(ToHex|BytesSerialize)" -A 2
Length of output: 471
Script:
#!/bin/bash
# Check Cargo.toml for hex dependency
rg "hex.*=" crates/common/Cargo.toml
# Check if [u8; N] implements hex::ToHex
ast-grep --pattern 'impl<$_> hex::ToHex for [$_]'
# Check if serde_bytes provides BytesSerialize for arrays
rg "serde_bytes.*=" crates/common/Cargo.toml
Length of output: 137
crates/keys/src/verifying_keys.rs (4)
20-23
: LGTM! Clean import organization
The imports are well-organized and properly scoped for the serialization functionality.
167-167
: LGTM! Correct usage of SECP256K1 context
The public key generation is properly implemented using the global SECP256K1 context.
224-235
: LGTM! Clean string conversion implementations
The TryFrom<String>
and Display
implementations are well-integrated with the new base64 serialization traits.
205-206
:
Correct the trait bound to T: AsRef<str>
in from_base64
method
The function expects a base64-encoded string but uses T: AsRef<[u8]>
. This should be T: AsRef<str>
for string input.
Apply this diff:
- fn from_base64<T: AsRef<[u8]>>(base64: T) -> Result<Self, Self::Error> {
+ fn from_base64<T: AsRef<str>>(base64: T) -> Result<Self, Self::Error> {
174a8ff
to
a052cd1
Compare
can base64 be removed from some cargo.toml? |
a052cd1
to
a3ba083
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can merge if ci passes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
crates/serde/src/lib.rs (1)
15-46
: LGTM! Consider adding documentation for the trait boundsThe implementation correctly handles both human-readable (hex) and binary formats with proper error handling. The trait bounds are well-chosen for the serialization requirements.
Consider adding documentation comments explaining the required trait bounds and their purpose:
/// Serializes a value that can be converted to both hex string and bytes. /// /// # Type Parameters /// * `T`: Must implement: /// - `ToHex` for hex string conversion /// - `BytesSerialize` for binary serialization pub fn serialize<S, T>(encodable: T, serializer: S) -> Result<S::Ok, S::Error>crates/keys/src/verifying_keys.rs (1)
233-236
: Consider using write_str for better performanceWhile the current implementation is correct, using
write_str
directly would avoid an intermediate String allocation.- let encoded = self.to_bytes().to_base64(); - write!(f, "{}", encoded) + write_str(f, &self.to_bytes().to_base64())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (18)
Cargo.toml
(1 hunks)crates/common/Cargo.toml
(0 hunks)crates/common/src/digest.rs
(3 hunks)crates/common/src/transaction.rs
(2 hunks)crates/da/Cargo.toml
(0 hunks)crates/da/src/celestia.rs
(2 hunks)crates/da/src/lib.rs
(3 hunks)crates/keys/Cargo.toml
(0 hunks)crates/keys/src/lib.rs
(4 hunks)crates/keys/src/verifying_keys.rs
(5 hunks)crates/serde/Cargo.toml
(1 hunks)crates/serde/src/base64.rs
(1 hunks)crates/serde/src/binary.rs
(1 hunks)crates/serde/src/hex.rs
(1 hunks)crates/serde/src/lib.rs
(2 hunks)crates/storage/Cargo.toml
(0 hunks)crates/storage/src/redis.rs
(5 hunks)crates/storage/src/rocksdb.rs
(4 hunks)
💤 Files with no reviewable changes (4)
- crates/keys/Cargo.toml
- crates/storage/Cargo.toml
- crates/common/Cargo.toml
- crates/da/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (11)
- crates/serde/Cargo.toml
- Cargo.toml
- crates/keys/src/lib.rs
- crates/da/src/celestia.rs
- crates/common/src/transaction.rs
- crates/storage/src/redis.rs
- crates/serde/src/hex.rs
- crates/common/src/digest.rs
- crates/serde/src/base64.rs
- crates/storage/src/rocksdb.rs
- crates/da/src/lib.rs
🔇 Additional comments (7)
crates/serde/src/lib.rs (2)
1-3
: LGTM! Module structure is well-organized
The public modules for base64, binary, and hex provide a clear separation of concerns for different serialization formats.
50-81
: LGTM! Verify consistent usage of base64 encoding across the codebase
The implementation correctly mirrors the raw_or_hex module's pattern, providing consistent behavior for base64 encoding.
Let's verify the base64 encoding usage across the codebase:
✅ Verification successful
Let me gather more information about the base64 usage by searching specifically for the base64 module and its traits.
Let me try a different approach to search for base64 related code.
Base64 encoding implementation is consistent across the codebase
The base64 encoding/decoding implementation is well-structured and consistently used:
- The
base64
module incrates/serde
defines the core traits:ToBase64
for encoding bytes to base64 stringsFromBase64
for decoding base64 strings to bytes
- These traits are properly implemented for common types:
Vec<u8>
and[u8; 32]
have standard implementationsVerifyingKey
andDigest
implementFromBase64
consistently using the standard implementations
- The
raw_or_b64
module correctly uses these traits for serde serialization/deserialization
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for consistent base64 encoding usage
# Look for potential inconsistencies in base64 encoding methods
# Search for base64 related code
rg -l 'base64|Base64' | while read -r file; do
echo "=== $file ==="
# Show base64 related code with context
rg -A 3 -B 3 'base64|Base64' "$file"
done
Length of output: 68811
Script:
#!/bin/bash
# Search for base64 module implementation and trait definitions
ast-grep --pattern $'trait ToBase64 {
$$$
}'
ast-grep --pattern $'trait FromBase64 {
$$$
}'
# Look for base64 module content
rg -A 10 -B 10 'pub mod base64' crates/serde/src/base64.rs
Length of output: 158
Script:
#!/bin/bash
# Search for base64 module content and trait implementations
rg -p 'trait (From|To)Base64|impl.*Base64|pub mod base64' --type rust
# Look for base64 usage patterns
rg -p 'to_base64|from_base64|Base64' --type rust -C 3
Length of output: 5366
crates/keys/src/verifying_keys.rs (2)
20-23
: LGTM! Clean import organization
The imports are well-organized and properly scoped for the new serialization functionality.
205-206
:
Fix trait bound for base64 input
The trait bound T: AsRef<[u8]>
is not optimal for base64-encoded string input.
As noted in a previous review, change the trait bound to T: AsRef<str>
to better handle string input:
- fn from_base64<T: AsRef<[u8]>>(base64: T) -> Result<Self, Self::Error> {
+ fn from_base64<T: AsRef<str>>(base64: T) -> Result<Self, Self::Error> {
crates/serde/src/binary.rs (3)
4-19
: LGTM!
The ToBinary
trait and its implementation are well-defined and correctly utilize bincode::serialize
for serialization.
21-25
: LGTM!
The FromBinary
trait is properly defined with an associated Error
type and a method for deserialization.
32-33
: Consider adding size limits for deserialization to enhance security
Deserializing untrusted input without size limits can lead to Denial-of-Service vulnerabilities if maliciously large inputs are provided.
Apply this diff to set a size limit:
fn decode_from_bytes<B: AsRef<[u8]>>(bytes: B) -> Result<Self, Self::Error> {
+ let config = bincode::config::DefaultOptions::new()
+ .with_limit(1024 * 1024); // 1MB limit, adjust as needed
+ config.deserialize(bytes.as_ref())
- bincode::deserialize(bytes.as_ref())
}
This change uses bincode
's configuration to impose a size limit during deserialization.
a3ba083
to
9c9f42c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
crates/serde/src/lib.rs (2)
34-45
: Consider enhancing error handling specificityWhile the error handling is functional, consider providing more specific error context when mapping from the hex parsing error to the serde error.
- T::from_hex(hex_str).map_err(serde::de::Error::custom) + T::from_hex(hex_str).map_err(|e| serde::de::Error::custom( + format!("Failed to parse hex string: {}", e) + ))
69-80
: Consider enhancing error handling specificityFor consistency with the suggested hex module improvement, consider enhancing the error context here as well.
- T::from_base64(base64_string).map_err(serde::de::Error::custom) + T::from_base64(base64_string).map_err(|e| serde::de::Error::custom( + format!("Failed to parse base64 string: {}", e) + ))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (21)
Cargo.toml
(1 hunks)crates/common/Cargo.toml
(0 hunks)crates/common/src/digest.rs
(3 hunks)crates/common/src/hashchain.rs
(1 hunks)crates/common/src/transaction.rs
(2 hunks)crates/common/src/tree/proofs.rs
(1 hunks)crates/common/src/tree/snarkable_tree.rs
(1 hunks)crates/da/Cargo.toml
(0 hunks)crates/da/src/celestia.rs
(2 hunks)crates/da/src/lib.rs
(3 hunks)crates/keys/Cargo.toml
(0 hunks)crates/keys/src/lib.rs
(4 hunks)crates/keys/src/verifying_keys.rs
(5 hunks)crates/serde/Cargo.toml
(1 hunks)crates/serde/src/base64.rs
(1 hunks)crates/serde/src/binary.rs
(1 hunks)crates/serde/src/hex.rs
(1 hunks)crates/serde/src/lib.rs
(2 hunks)crates/storage/Cargo.toml
(0 hunks)crates/storage/src/redis.rs
(5 hunks)crates/storage/src/rocksdb.rs
(4 hunks)
💤 Files with no reviewable changes (4)
- crates/keys/Cargo.toml
- crates/storage/Cargo.toml
- crates/da/Cargo.toml
- crates/common/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (15)
- crates/serde/Cargo.toml
- Cargo.toml
- crates/common/src/tree/proofs.rs
- crates/keys/src/lib.rs
- crates/da/src/lib.rs
- crates/common/src/hashchain.rs
- crates/common/src/transaction.rs
- crates/da/src/celestia.rs
- crates/storage/src/redis.rs
- crates/storage/src/rocksdb.rs
- crates/serde/src/binary.rs
- crates/common/src/digest.rs
- crates/common/src/tree/snarkable_tree.rs
- crates/serde/src/hex.rs
- crates/serde/src/base64.rs
🔇 Additional comments (11)
crates/keys/src/verifying_keys.rs (7)
20-23
: LGTM: Clean dependency migration
The change successfully migrates base64 functionality to use the internal prism_serde
crate, aligning with the PR objective to remove the external base64 dependency.
167-167
: LGTM: Clean code style improvement
The simplified expression maintains the same functionality while improving readability.
Line range hint 187-220
: LGTM: Well-documented implementation with proper error handling
The implementation correctly handles different key lengths and provides comprehensive documentation about the expected input formats.
224-230
: LGTM: Clean implementation following SRP
The implementation correctly delegates to from_base64
, maintaining code reusability.
234-235
: LGTM: Consistent use of new base64 encoding
The implementation correctly uses the new to_base64
trait method, maintaining consistency with the serialization changes.
205-206
:
Correct the trait bound to T: AsRef<str>
in from_base64
method
The trait bound issue from the previous review still needs to be addressed.
- fn from_base64<T: AsRef<[u8]>>(base64: T) -> Result<Self, Self::Error> {
+ fn from_base64<T: AsRef<str>>(base64: T) -> Result<Self, Self::Error> {
Line range hint 187-235
: Verify compatibility with existing serialized data
The changes to base64 encoding/decoding could potentially impact existing stored or transmitted data. Let's verify the compatibility:
crates/serde/src/lib.rs (4)
Line range hint 6-11
: LGTM! Well-structured struct definition
The struct is properly defined with appropriate serde attributes for serialization control.
21-31
: LGTM! Well-implemented generic serialization
The serialize function is well-designed with:
- Proper generic constraints using ToHex + BytesSerialize
- Clear handling of human-readable vs binary formats
- Efficient implementation
56-66
: LGTM! Consistent implementation with raw_or_hex
The serialize function maintains consistency with the hex implementation while properly handling base64 encoding.
1-3
: Verify base64 module necessity
Based on the PR comments about removing the base64 dependency, let's verify if this module is still required.
Summary by CodeRabbit
New Features
Digest
struct, including new instance creation options from hexadecimal and base64 formats.CelestiaConnection
struct with improved error handling and updated decoding processes.FinalizedEpoch
struct to streamline signature handling.Bug Fixes
Chores
serde_bytes
to enhance serialization capabilities.