PDS: Adding new leaf to MST is O(n²) #22

rudyfraser · 2024-10-12T13:54:37Z

See the graph for a sample of 254 leaves being added.

Discovered this issue when testing adding records in bulk (f6b1b26). Should be able to add 1000 records in ms. Canonical code for the same test runs in 0.717 s, estimated 2 s


DavidBuchanan314 · 2024-10-12T15:46:35Z

So just to clarify, adding n nodes is costing O(n²) overall, meaning each individual add is O(n)?

Versus the ideal expectation of O(n log n) overall, and O(log n) for each add?
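
For reference, the two totals follow from summing the per-add cost over n inserts. This is a rough derivation that assumes each add costs a constant c times either the current tree size k or log k; the constant c is introduced here only for illustration.

```latex
% Per-add cost proportional to the current tree size k:
\sum_{k=1}^{n} c\,k \;=\; c\,\frac{n(n+1)}{2} \;=\; O(n^2)

% Per-add cost proportional to \log k:
\sum_{k=1}^{n} c\,\log k \;=\; c\,\log(n!) \;=\; \Theta(n \log n)
```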

DavidBuchanan314 · 2024-10-12T15:51:46Z

I think this is your issue right here:

rsky/rsky-pds/src/repo/mst/mod.rs

Line 584 in f6b1b26

    let mut new_root = MST::create(self.storage.clone(), Some(updated), Some(key_zeros))?;

I'm not a rustacean so forgive me if I'm misunderstanding, but if you're copying the whole MST storage then that's an O(n) operation (where n is the number of nodes currently in the tree)

Edit: ah yeah I'm probably mistaken, I see storage is a SqlRepoReader so that line doesn't actually copy the data
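
To illustrate the distinction in the edit above, here is a minimal sketch of the two kinds of clone. DeepStore and SharedStore are hypothetical stand-ins, not rsky types, and nothing here reflects how SqlRepoReader is actually implemented; it only shows why cloning a handle can be O(1) while cloning owned data is O(n).

```rust
use std::sync::Arc;

/// Hypothetical store that owns its blocks directly (not an rsky type).
#[derive(Clone)]
struct DeepStore {
    blocks: Vec<Vec<u8>>, // cloning copies every block: O(n)
}

/// Hypothetical store that holds a shared handle to its blocks (not an rsky type).
#[derive(Clone)]
struct SharedStore {
    blocks: Arc<Vec<Vec<u8>>>, // cloning bumps a reference count: O(1)
}

fn main() {
    let data: Vec<Vec<u8>> = (0..1000u32).map(|i| i.to_be_bytes().to_vec()).collect();

    let deep = DeepStore { blocks: data.clone() };
    let deep_copy = deep.clone();
    // The deep clone lives in a separate allocation.
    assert_ne!(deep.blocks.as_ptr(), deep_copy.blocks.as_ptr());

    let shared = SharedStore { blocks: Arc::new(data) };
    let shared_copy = shared.clone();
    // The shared clone points at the same allocation.
    assert!(Arc::ptr_eq(&shared.blocks, &shared_copy.blocks));
    assert_eq!(Arc::strong_count(&shared.blocks), 2);
}
```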

rudyfraser · 2024-10-12T16:31:04Z

> So just to clarify, adding n nodes is costing O(n²) overall, meaning each individual add is O(n)?
>
> Versus the ideal expectation of O(n log n) overall, and O(log n) for each add?

Yes, I think that is exactly right. I appreciate you taking a look at this!

mackuba · 2024-10-13T22:30:56Z

cc @steveklabnik

steveklabnik · 2024-10-28T21:31:18Z

Sorry for the late response here; I've glanced at this a few times, and nothing really sticks out to me. I don't have time this exact minute to try and dig in even further, but rather than look it over and go "nope, not immediately seeing it" and not leaving a comment like the last two or three times, figured I'd say something at least :)

DavidBuchanan314 · 2024-11-17T02:57:44Z

I just came across this https://github.com/domodwyer/merkle-search-tree - which is a rust MST implementation with very impressive perf numbers. It doesn't look like it was written with atproto in mind though, so I'm not sure it'd be easy to drop in, but it might be useful for reference

afbase · 2024-12-16T01:11:15Z

@rudyfraser are you still working on this?

Are you happy with the implementation of the MST - at the time of asking this, c948af2?

afbase · 2024-12-16T01:12:03Z

> @rudyfraser are you still working on this?
>
> Are you happy with the implementation of the MST - at the time of asking this, c948af2?

I'm happy to take a look if you'd like. Any guidance on what you'd like to see would be appreciated.

rudyfraser · 2024-12-16T05:09:36Z

@afbase I still need help with this. Ultimately I would expect the unit tests in rsky-pds::repo::mst::tests to perform similarly to the canonical TS implementation. Currently they take significantly longer to run.

High level outcome would be to mirror the unit tests in https://github.com/bluesky-social/atproto/blob/main/packages/repo/tests/mst.test.ts (especially ones like "adds records") and pass the expected outcomes with similar performance.

afbase · 2024-12-17T06:00:39Z

@rudyfraser

Testing, Benchmarking, (and Fuzzing)

I'll prioritize this first as I can work with the implementation as-is.

Canonical Implementation

From the README.md:

Implementations should follow closely to the canonical Typescript implementation

rsky's MST storage is a SqlRepoReader, whereas the TypeScript implementation's MST storage is a ReadableBlockStore. Is there a desire to implement a ReadableBlockStore?

Cloning, Reference Counting, or Neither

For when we need to insert a Cid in a higher layer of the MST, and reset the root, my thoughts are that we might want to have some kind of Arc<SqlRepoReader> or Arc<RwLock<SqlRepoReader>>. This would involve a great deal of refactoring across rsky.
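
A minimal sketch of what the Arc<RwLock<...>> option could look like. BlockStore and Tree are hypothetical stand-ins, not the rsky SqlRepoReader or MST types, and create_parent is an assumed helper that only illustrates building a new root without copying storage.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical block store keyed by a CID-like string (stand-in for SqlRepoReader).
#[derive(Default)]
struct BlockStore {
    blocks: HashMap<String, Vec<u8>>,
}

/// Hypothetical tree handle; every handle shares the same store.
#[derive(Clone)]
struct Tree {
    storage: Arc<RwLock<BlockStore>>,
    layer: u32,
}

impl Tree {
    fn new(storage: Arc<RwLock<BlockStore>>, layer: u32) -> Self {
        Tree { storage, layer }
    }

    /// Build a root one layer up. Only the Arc pointer is cloned, not the blocks.
    fn create_parent(&self) -> Tree {
        Tree::new(Arc::clone(&self.storage), self.layer + 1)
    }

    fn put(&self, cid: &str, bytes: Vec<u8>) {
        // unwrap: a poisoned lock simply panics in this sketch.
        self.storage.write().unwrap().blocks.insert(cid.to_string(), bytes);
    }

    fn has(&self, cid: &str) -> bool {
        self.storage.read().unwrap().blocks.contains_key(cid)
    }
}

fn main() {
    let root = Tree::new(Arc::new(RwLock::new(BlockStore::default())), 0);
    let parent = root.create_parent();
    parent.put("cid-a", vec![1, 2, 3]);
    // The write made through `parent` is visible through `root`.
    assert!(root.has("cid-a"));
    assert_eq!(parent.layer, 1);
}
```

If only reads are shared, a plain Arc without the RwLock would be enough; the lock is needed only when handles write through the shared store.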

The other merkle search tree implementation

However, when looking at domodwyer/merkle-search-tree, I don't see any reference counting or locking on the Page (i.e. layer). There is one caveat: a Node (NodeEntry) can dereference a Page (layer) that is "less than"/"lower". Its upsert (add) implementation starts from the root and traverses down to the corresponding Page (layer), whereas the TypeScript and rsky implementations, I believe (please correct me if I am wrong), start from the bottom and traverse upward.

I'm hesitant to look into the other MST too much further as that MST and the ATProto implementation might be too far apart in implementation to draw ideas from meaningfully.

afbase · 2024-12-24T04:51:51Z

@rudyfraser

I made a branch on my fork that does some benchmarking of the add at sizes 100, 500, and 1000. It uses Criterion and cargo bench. I've copied the two benchmark reports I ran, each stored under its respective commit hash.
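
For a quick sanity check without the Criterion setup, the same three sizes can be timed with only the standard library. This is a rough sketch, not the benchmark from the branch: BTreeMap stands in for the MST, and time_inserts and the key format are hypothetical.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Time `n` sequential inserts into a fresh container.
/// BTreeMap is only a stand-in for the MST `add` path.
fn time_inserts(n: u32) -> Duration {
    let mut map = BTreeMap::new();
    let start = Instant::now();
    for i in 0..n {
        map.insert(format!("com.example.record/{i:08}"), i);
    }
    let elapsed = start.elapsed();
    assert_eq!(map.len(), n as usize);
    elapsed
}

fn main() {
    for n in [100u32, 500, 1000] {
        let total = time_inserts(n);
        // Roughly flat per-insert time as n grows suggests ~O(log n) adds;
        // per-insert time growing in proportion to n suggests O(n) adds.
        println!("n={n:>5} total={total:?} per_insert={:?}", total / n);
    }
}
```

Single wall-clock runs like this are noisy, so they are only good for spotting a gross trend; Criterion's repeated sampling is the better tool for real comparisons.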

Here is a screenshot of the report summary for the second benchmark's 1000-key sample size:

You can take a look at the others by opening up those index.html files in the benches folder.

Also see my notes on the metric I made up. I think it is useful for giving some idea of the sample's "tree depth", which, depending on insert order, can cause the let mut new_root = MST::create(self.storage.clone(), Some(updated), Some(key_zeros))?; line to be triggered, IIUC.
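
For context on "tree depth": in the atproto MST, a key's layer is the number of leading zero bits of its SHA-256 digest, counted in 2-bit chunks. The sketch below assumes key_zeros in the quoted line corresponds to that value. layer_for_digest is a hypothetical helper that takes an already computed digest so the example stays standard-library only.

```rust
/// Count a digest's leading zero bits and divide by two (2-bit chunks).
/// The caller is assumed to pass the SHA-256 digest of the key.
fn layer_for_digest(digest: &[u8]) -> u32 {
    let mut bits = 0;
    for &byte in digest {
        bits += byte.leading_zeros();
        if byte != 0 {
            break; // the first non-zero byte ends the run of zeros
        }
    }
    bits / 2
}

fn main() {
    assert_eq!(layer_for_digest(&[0xFF, 0x00]), 0); // no leading zero bits
    assert_eq!(layer_for_digest(&[0x3F]), 1); // 2 zero bits
    assert_eq!(layer_for_digest(&[0x0F]), 2); // 4 zero bits
    assert_eq!(layer_for_digest(&[0x00, 0x40]), 4); // 9 zero bits, rounded down
}
```

A new key whose layer is higher than the current root's is the case that forces a new root to be created above the existing tree.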

I'm not super familiar with TypeScript, so the canonical implementation's add tests are not exactly clear to me, but I could dig into them more. Granted, this benchmark is very rustic (I don't know what the Rust equivalent of "pythonic" is) and doesn't conform to the canonical TypeScript tests, nor am I sure we want to mirror them, as that sounds like creating an entire test library for canonical-ness.

Some Questions

It would take a little while to make the benchmarks run as a GitHub Action. That would also eat up a lot of compute time, so I'm not sure if that is desirable. Perhaps just documenting how to run the benchmarks locally on a developer's machine is best. What do you think?
I'll see what else I can do with the tests from https://github.com/bluesky-social/atproto/blob/main/packages/repo/tests/mst.test.ts. Are there particular add tests, like the "MST Interop Edge Cases", that you're interested in seeing in the test module?

erlend-sh · 2024-12-24T08:14:48Z

millipds by @DavidBuchanan314 might be far enough along now that there’s prior art to draw from there as well?

afbase · 2025-01-04T00:59:40Z

I've been playing around with tweaking the add function to see if my benchmarks improved any. They got worse 🙃

From the README.md:

Implementations should follow closely to the canonical Typescript implementation

I'd like to tweak the MST struct and a couple of other things to make them more reference-count friendly, which could work in place of cloning. This might make the implementation drift a bit from the canonical TypeScript implementation, but my aim would be to keep the unit tests working.

rudyfraser · 2025-01-04T02:31:20Z

Documentation of the benchmark process would be fine, but I personally think it's lower priority at the moment.

Most of the unit tests already exist here https://github.com/blacksky-algorithms/rsky/blob/main/rsky-pds/src/repo/mst/mod.rs#L1274

For this case moving away from the canonical implementation would be fine as long as changes are directly related to the mst, existing tests pass, and there are improvements to performance.

I have more availability now and will try to revisit this as well
