-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace recursive tree traversal #58
Conversation
let digest: [u8; 32] = value.hash().digest().try_into()?; | ||
Ok(Self(digest)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. just some code shuffling in the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all: AWESOME!!!
Now, some remarks:
-
I think it is not clear whether it is worth it to have the iterator do double duty as forward and backward iterator. There are so many mode switches that it might be nicer to just have a bit more duplicate code and have a separate forward and backward iterator. Maybe the code can be DRYed in another way.
-
what would be awesome would be if you could combine the two stacks into one, and use a struct with named fields for that. It seems that one stack is "lagging" the other one, but other than that they seem to be used in the same way. The depth is going to be the same at all times +-1. So a single stack might make the code more readable...
@rklaehn I combined both stacks into one, and created a named struct to use it with: |
banyan/src/forest/read.rs
Outdated
Mode::Backward => *pos == usize::MAX, | ||
}; | ||
if new_branch { | ||
if head.filter.is_none() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much nicer this way, I think.
banyan/src/forest/read.rs
Outdated
@@ -162,44 +183,39 @@ where | |||
self.query | |||
.intersecting(start_offset, index, &mut q_matching); | |||
debug_assert_eq!(branch.children.len(), q_matching.len()); | |||
let _ = std::mem::replace(matching, q_matching); | |||
head.filter.replace(q_matching); | |||
|
|||
if matches!(self.mode, Mode::Backward) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be clearer here to set the position in both cases, instead of using the default if Mode is Forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good. Will probably approve. Just checking this out to look at it some more.
I think this would benefit from a test that builds a super-degenerate tree aka linked list with 10000 iitems and then tries to traverse it. I think we should be able to make this reliably fail with the previous impl. I will write it. |
when reading this, I thought - oh shit, there is a nested loop. But actually there is just one.
I think there is a bug with the range for the placeholder. Also, I have experimented with how to simplify the stack frame. See #59 |
banyan/src/forest/read.rs
Outdated
index: Arc<Index<T>>, | ||
// If `index` points to a branch node, `position` points to the currently | ||
// traversed child | ||
position: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is probably worth noting that this is not the child offset, but the child offset in case of forward traversal, and the child offset + 1 in case of backward traversal.
I guess you have already tried to make this an isize
and have it be exactly the position?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this out here: 10bfa3d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! Let's get it in!
I haven't tried that. At some point I kind of lost sight ..
Here is an experiment, making position an isize so it can refer directly to the child: |
Possible traversal improvements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might still try to DRY this, especially the forward/backwards branches. But I think it reads quite well now. Great work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, quite readable, nice and tidy! Just a few nitpicks.
type Item = Result<FilteredChunk<T, V, E>>; | ||
|
||
fn next(&mut self) -> Option<Self::Item> { | ||
let res: FilteredChunk<T, V, E> = loop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that I’ll find a break x
somewhere in here, maybe just return Some(Ok(x))
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's a matter of preference I guess. I like the break approach better. Less boilerplate in the loop.
continue; | ||
} | ||
|
||
match self.forest.load_node(&head.index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
side remark: it feels weird to apply a “load node” function to an index data structure when in my head the index is part of said node — maybe I got this wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it is indeed a bit weird. All nodes in banyan are "split in half". It is not very object oriented, since data that you would logically group together is separate so it is ordered by probability / frequency of access.
extra: (self.mk_extra)(index.as_index_ref()), | ||
}; | ||
|
||
break placeholder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it necessary to emit these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the placeholders?
If you skip a subtree because either the filter does not match, or the subtree has been purged, you still want a placeholder to indicate that you have skipped offsets x..x+n. There is also the mk_extra fn that can extract some info from the tree node, e.g. the last lamport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that iter_filtered_chunked is a low level API that can be used in multiple ways. If you want just the data, you can just flatten them away.
// tree by using an appropriate mk_extra fn, or check | ||
// `data.len()`. | ||
Ok(_) => { | ||
let TraverseState { index, .. } = self.stack.pop().expect("not empty"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes that the tree always has at least one branch, correct? With time-based expiry I could imagine a tree that is pruned completely empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A tree where all events have been pruned will be branch containing a summary of the pruned data, including the offset range. You can never go from a non-empty tree to an empty tree.
The initial empty tree is peeled off with the struct Tree, which is an option of a non empty tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But - it would be nicer to more explicitly spell out the invariants. So might be worth trying out https://crates.io/crates/contracts for banyan - if there is a way to switch the stuff off without switching to release mode...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made an issue: #60
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words: If the root node is a PrunedBranch
, we will end up in this case with Ok(PrunedBranch(index))
and everything is fine. This code won't panic.
Closes https://github.com/Actyx/Cosmos/issues/5824
Closes https://github.com/Actyx/Cosmos/issues/6153
next step is to filter_chunked_reverse ..