-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to verify the digest of the whole file when digests of file parts are known #18
Comments
I think this is connected to how abao can verify slices of data. https://github.com/n0-computer/abao But #17 (comment) |
kinda. A BLAKE3 digest is the root of a Merkle tree, but the root hash and the interior node hashes use different flags. (This was an intentional design decision.) So if you have two files A and B, there's no way to get As pcfreak said, you probably want to use Bao for this. The Bao encoding contains the interior node hashes, which lets you reconstruct the Merkle root. |
@lukechampine, thank you for your response! Unfortunately, I cannot use Bao since Bao requires scanning the whole file in sequence, and I need to avoid it (the parts could be processed out of sequence, on different machines, etc.) I can, however, get some other aspect of an internal state of Blake3 for each part separately. I wonder if the complete blake3 digest can be constructed from just the chaining values of each part? |
Ah. Well, as long as each part is aligned to 1024-byte boundaries of the original file, it should be possible to compute the chaining value(s) of each part and combine them into the total root. This package doesn't provide a direct way to do that, but you could achieve this by calling If I were you, I think I would fork this package and add a |
@lukechampine, thank you, this is very helpful! I'll try these suggestions. |
@lukechampine, I managed to get it to work, I also had to make the "counter" work right for parts other than the first one. Would you be interested in a PR that implements a wrapper around Hasher that minimally exposes some internal state and functionality of the Hasher, similar to what the Rust crate has in their "guts" module? They do it for a very similar purpose. |
no promises, but if you open a PR I'll definitely take a look! |
@haimgel fyi, I've added a |
Thank you so much for this! |
From my reading of Blake3 paper, and my understanding of Merkle trees in general, I think this should be doable in theory: if I have Blake3 digests of parts of a file, I should be able to calculate the digest of the whole (by building the tree of digests and then getting the root). But I don't think this module's API exposes enough to make it work, or am I missing something?
The text was updated successfully, but these errors were encountered: