-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): replace hummock protobuf strcut with rust struct #15386
Conversation
…nto li0k/storage_pb_to_struct
…nto li0k/storage_pb_to_struct
pub struct SstableInfo { | ||
pub object_id: u64, | ||
pub sst_id: u64, | ||
pub key_range: Option<KeyRange>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use FullKeyRange
instead rather than the KeyRange
. The field is_right_inclusive
is very implicit. Maybe we can do this in a future PR.
src/storage/hummock_sdk/src/lib.rs
Outdated
fn estimated_encode_len(&self) -> usize; | ||
} | ||
|
||
pub trait ProtoSerializeExt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between these two traits? May add some comments to describe the difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all appearance of T
is Self
, we don't need the T
, and we can use Self
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
…nto li0k/storage_pb_to_struct
src/storage/hummock_sdk/src/lib.rs
Outdated
// Similar to `ProtoSerializeExt``, we allow `ProtoSerializeOwnExt to provide a method to obtain ownership. Therefore, copying behavior in memory can be reduced | ||
pub trait ProtoSerializeOwnExt { | ||
type PB: Clone + PartialEq + ::prost::Message; | ||
|
||
fn from_protobuf_own(pb: Self::PB) -> Self; | ||
|
||
fn to_protobuf_own(self) -> Self::PB; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Similar to `ProtoSerializeExt``, we allow `ProtoSerializeOwnExt to provide a method to obtain ownership. Therefore, copying behavior in memory can be reduced | |
pub trait ProtoSerializeOwnExt { | |
type PB: Clone + PartialEq + ::prost::Message; | |
fn from_protobuf_own(pb: Self::PB) -> Self; | |
fn to_protobuf_own(self) -> Self::PB; | |
} | |
// Similar to `ProtoSerializeExt``, we allow `ProtoSerializeOwnExt to provide a method to obtain ownership. Therefore, copying behavior in memory can be reduced | |
pub trait OwnedProtoSerialize { | |
type PB: Clone + PartialEq + ::prost::Message; | |
fn from_owned_protobuf(pb: Self::PB) -> Self; | |
fn to_owned_protobuf(self) -> Self::PB; | |
} |
src/storage/hummock_sdk/src/lib.rs
Outdated
@@ -355,3 +355,25 @@ impl EpochWithGap { | |||
self.0 & EPOCH_SPILL_TIME_MASK | |||
} | |||
} | |||
|
|||
pub trait ProtoSerializeSizeEstimatedExt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the traits all named with the suffix 'Ext'? Should they give default implementation and implements for all types that applies the trait without Ext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think defining a new trait is unnecessary, unless we have some common logic that takes the trait abstraction as input, but it seems that there is no such common logic. It's more like enforcing a method name convention.
As discussed offline with @Li0k , it is not necessary to extract a trait for there is nowhere to use it. The structs should impl the impl From<PbVnodeWatermark> for VnodeWatermark { ... }
impl From<&PbVnodeWatermark> for VnodeWatermark { ... }
impl From<VnodeWatermark> for PbVnodeWatermark { ... }
impl From<&VnodeWatermark> for PbVnodeWatermark { ... } If there is a need to use a trait, the trait can be defined with trait ProtoSerialize: From<&PbVnodeWatermark> + From<&VnodeWatermark> {} And automatically impl for all types that satisfy the bound. e.g. impl<T: From<&PbVnodeWatermark> + From<&VnodeWatermark>> ProtoSerialize for T {} |
The downside of Say that we'll always convert It would be nice if we can directly write |
Do we need to impl |
…nto li0k/storage_pb_to_struct
…nto li0k/storage_pb_to_struct
@@ -326,7 +325,7 @@ impl NonOverlapSubLevelPicker { | |||
level_ssts.sort_by(|sst1, sst2| { | |||
let a = sst1.key_range.as_ref().unwrap(); | |||
let b = sst2.key_range.as_ref().unwrap(); | |||
a.compare(b) | |||
a.cmp(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May use sort_by_key
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the design of interface, we should clone the key_range if we use sort_by_key
. Even though copying the key_range after the PR is lightweight, I wanted to keep the clone from being triggered as much as possible.
src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs
Outdated
Show resolved
Hide resolved
src/storage/hummock_sdk/src/lib.rs
Outdated
@@ -355,3 +355,25 @@ impl EpochWithGap { | |||
self.0 & EPOCH_SPILL_TIME_MASK | |||
} | |||
} | |||
|
|||
pub trait ProtoSerializeSizeEstimatedExt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think defining a new trait is unnecessary, unless we have some common logic that takes the trait abstraction as input, but it seems that there is no such common logic. It's more like enforcing a method name convention.
pub uncompressed_file_size: u64, | ||
} | ||
|
||
impl From<&PbOverlappingLevel> for OverlappingLevel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's unnecessary to implement a trait for the conversion, because we don't have any code that depends on the From
or Into
trait. The only benefit is that when we implement From
, we can call .into()
, but the benefit is limited. The code will be the same if we don't implement the trait and just implement a method for the new struct like
impl OverlappingLevel { // No impl `From<&PbOverlappingLevel>`
fn from(pb_overlapping_level: &PbOverlappingLevel) -> Self {
Self {
...
}
}
}
And then we'd better rename the method name from from
, which is too general, to from_protobuf
and to_protobuf
so that we can explicitly know that we are doing a conversion between new struct and pb when we read the code.
ditto for all other impl From
in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not biased. @hzxa21
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases, I want to consume the incoming arguments directly (instead of copying them), and if we want to fulfil this requirement, we still need to implement four functions
- from_protobuf(struct)
- from_protobuf(struct_ref)
- to_protobuf1(struct)
- to_protobuf2(struct_ref)
I don't think there's much difference except in naming, as in commit, and I've provided from_protobuf
and to_protobuf
wrappers for some common structures
…nto li0k/storage_pb_to_struct
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9425213 | Triggered | Generic Password | ebcc00b | ci/scripts/e2e-sink-test.sh | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs
Outdated
Show resolved
Hide resolved
src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs
Outdated
Show resolved
Hide resolved
} | ||
|
||
impl CompactTask { | ||
pub fn task_type(&self) -> PbTaskType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto for these methods.
Besides, I think we should not provide method to get the PbXxx
type from the CompactTask
. We'd better just return the the Xxx
defined by ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change it in other PR, remove the get
function first
…nto li0k/storage_pb_to_struct
…nto li0k/storage_pb_to_struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM! Thanks for the great work!
src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs
Outdated
Show resolved
Hide resolved
.sum::<usize>() | ||
+ size_of::<u32>() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split this file and move the new structs to separate files to avoid making this file too large?
…nto li0k/storage_pb_to_struct
…nto li0k/storage_pb_to_struct
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
as title
ext
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.