Benchmark analysis #108
Comments
I tested this idea, running a [...] I have an idea that is basically to just pass down a boolean [...] This way we let the query developer choose wisely when to do decryption or not. The default (is it [...] cc @arj03 |
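A minimal sketch of the "pass down a boolean" idea, under stated assumptions: `streamLog`, its `decrypt` option, and `fakeUnbox` are hypothetical names for illustration and are not the ssb-db2 or jitdb API. It assumes boxed content is a string ending in `.box`, and it assumes decryption is on by default, which the thread does not settle.

```javascript
// Assumption: an encrypted message carries its content as a string ending in ".box".
function isBoxed(msg) {
  return typeof msg.content === 'string' && msg.content.endsWith('.box')
}

// Hypothetical stand-in for a log stream: it only calls `unbox` when the
// caller has not opted out with `decrypt: false`.
function streamLog(records, opts, unbox) {
  const decrypt = opts.decrypt !== false // assumed default: decrypt
  return records.map((msg) => (decrypt && isBoxed(msg) ? unbox(msg) : msg))
}

const records = [
  { content: { type: 'vote' } }, // public message
  { content: 'c2VjcmV0.box' }, // boxed (private) message
]

let unboxCalls = 0
const fakeUnbox = (msg) => {
  unboxCalls += 1 // count how often the expensive step runs
  return { content: { type: 'post', text: 'decrypted' } }
}

// A "votes for this post" style query can skip decryption entirely.
streamLog(records, { decrypt: false }, fakeUnbox)
console.log(unboxCalls) // 0

// A query that needs private messages opts in.
streamLog(records, { decrypt: true }, fakeUnbox)
console.log(unboxCalls) // 1
```

The point of the sketch is only that the expensive step is skipped when the query author says it is not needed; where such a flag would actually live is an open question in this thread.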
We can do a flag, but this is a one-time cost and it's shared between the indexes in ssb-db2 and jitdb, so in the overall picture I don't think it will matter that much. |
Also will respond more in full tomorrow :) |
Sounds like a good idea. It would still need a log.drain.
If it will help manyverse I'll be happy to accept such a PR, but I have been testing ssb-browser and I wouldn't use it there. |
Hmm yeah, would be interesting to test. The main thing I have left for the port of ssb-browser to db2 is that when I do an initial sync the browser just spins for quite a while afterwards. Outside the browser things are fine. I still need to see what is going on, but I'll keep this in mind. Thanks for mentioning it. |
@arj03 Alright, I thought this through after our video call, and here's what I figured out:
Yes, it's true that the heavyweight decrypt is run only once, and for subsequent queries it should be much lighter. But from the perspective of the user, it would be good if some queries took priority over others, with their results shown ASAP while lower-priority queries are processed in the background. And all this is about initial indexing, which is what we are trying to speed up; obviously even ssb-db1 performs fine after initial indexing.
Concurrency
So for a query like "give me all votes for this post", the user doesn't need to care about private messages, so any work that is not absolutely necessary should not block the query in question. Because we put [...] I did a quick check, and if I turn off (1) or (2), then the benchmark looks like this:
I'm not sure what the best solution for that is, but two ideas I have are:
Priority system
We could assign different priority numbers to queries, and then schedule those with a higher number to run first. This could be too complex to build, especially when workers are involved.
Threads (workers)
I am not sure this would 100% work, because it could be that our workloads are I/O bound, but because unboxing is CPU heavy, it could help to run the |
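The priority system idea could be sketched roughly as follows. `PriorityScheduler` is a hypothetical class, not something that exists in ssb-db2 or jitdb, and the tasks here are synchronous placeholders; real queries are asynchronous and would need more care, which is part of why this might be too complex to build.

```javascript
// Hypothetical scheduler: tasks with a higher priority number run first,
// and tasks with equal priority keep their insertion order.
class PriorityScheduler {
  constructor() {
    this.queue = []
    this.seq = 0 // insertion counter, used to break ties
  }

  schedule(priority, task) {
    this.queue.push({ priority, seq: this.seq++, task })
    this.queue.sort((a, b) => b.priority - a.priority || a.seq - b.seq)
  }

  runAll() {
    const results = []
    while (this.queue.length > 0) {
      const { task } = this.queue.shift()
      results.push(task())
    }
    return results
  }
}

const scheduler = new PriorityScheduler()
scheduler.schedule(1, () => 'index private messages')
scheduler.schedule(10, () => 'votes for this post')
scheduler.schedule(5, () => 'public feed')

const order = scheduler.runAll()
console.log(order)
// [ 'votes for this post', 'public feed', 'index private messages' ]
```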
Would be interesting to check the threads idea. If you want to pursue the decrypt idea, it would be natural to do that in the isPublic operator. As long as the implementation is straightforward I don't see a problem adding that. |
I don't think I'll try the "decrypt only if the boolean opt is true" idea, not yet. I'll do some experiments with the threads idea, but I'll have to make a Manyverse release today and tomorrow. |
I ran this benchmark again (just for my own sake, to keep these notes later for study), but this time on my production mobile Manyverse and building all sorts of indexes (see list below). The durations indicate "time until first content displays".
|
By the way, to analyze memory use, it's useful to run the Chrome inspector for Node.js. There, I used a "heap snapshot" that showed all strings and objects and buffers allocated. Turns out there's approx. 330 MB of TypedArrays, and most of those are cached blocks in aligned-block-file or async-AOL. Makes sense. |
We could try fiddling with this. Have some good benchmarks now so should be easier to see. |
You kind of nerd-sniped me (by the way, I work on these alternative things while Manyverse is compiling, or the app is indexing).
"1024 / 1000" means async-AOL with cache size 1024 and aligned-block-file with cache size 1000
|
When you remove everything so the old log is just used for streaming, then disabling the cache in aligned-block-file will make a ton of sense. It would be both faster and use less memory. |
I'm trying to measure disk I/O performance on desktop and on mobile, to compare, and I'm not sure if I did this correctly because Android and Linux desktop don't have the same tools, but here is what I did. On desktop: [...] On Android, I used an app called Disk Speed / Performance Test and got Read 189 MB/s and Write 34 MB/s. Not sure if I can trust these numbers, but it seems like mobile is 5.5x slower than desktop. If it takes ~3 sec for desktop to scan the log, it should take about 17 sec for mobile to scan it (3 × 5.5 = 16.5). |
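The estimate above is plain proportional scaling, which assumes the log scan is entirely bound by disk read speed. As a tiny sketch (the function name is made up for illustration):

```javascript
// Scale a measured duration by an observed slowdown factor.
// Assumption: the scan is fully I/O bound, so time scales linearly.
function estimateScanSeconds(baselineSeconds, slowdownFactor) {
  return baselineSeconds * slowdownFactor
}

const estimate = estimateScanSeconds(3, 5.5)
console.log(estimate) // 16.5
console.log(Math.round(estimate)) // 17
```

If part of the scan is CPU bound, the real ratio could differ from the disk I/O ratio, so this is only a rough upper-level guess.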
DB2 Benchmarks running in Manyverse
Nothing else running. Total msg count: 1015957
Mobile seems on average 4.07x slower than the GitHub Actions server. So that's realistically quite close to the "5.5x slower" disk I/O. |
On mobile:
Not a lot, but almost 1 second. I can imagine that not loading the whole thing might speed up startup. |
Hmm yeah, might be one of those things that are hard to benchmark on a computer compared to a phone. Interested to hear how it is without descriptions. |
Found out that loading the
3300ms in total, 1st part is loading the atomically-universal, 2nd part is the JSON.parse. What would the performance look like if we did it in leveldb? Perhaps load time would be smaller, and query time would be larger? |
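To see how a load time like this splits between reading the file and parsing it, the two phases can be timed separately. This is a generic sketch using Node's built-in `fs` and `process.hrtime.bigint()`; `timedLoadJSON` is a hypothetical helper, and it reads the file directly rather than through atomically-universal.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: time the file read and the JSON.parse separately.
function timedLoadJSON(filename) {
  const t0 = process.hrtime.bigint()
  const raw = fs.readFileSync(filename, 'utf8')
  const t1 = process.hrtime.bigint()
  const value = JSON.parse(raw)
  const t2 = process.hrtime.bigint()
  return {
    value,
    readMs: Number(t1 - t0) / 1e6, // nanoseconds to milliseconds
    parseMs: Number(t2 - t1) / 1e6,
  }
}

// Usage with a small throwaway file; a real index file would be far larger.
const file = path.join(os.tmpdir(), `timed-load-${process.pid}.json`)
fs.writeFileSync(file, JSON.stringify({ hello: 'world' }))
const result = timedLoadJSON(file)
fs.unlinkSync(file)
console.log(`read: ${result.readMs}ms, parse: ${result.parseMs}ms`)
```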
Hmm. That is a good data point, I'll have to check that to see. |
Maybe we should pair on it tomorrow? |
Yes! |
Well, I have a PR that adds a leveldb index for that and it seems it speeds up things rather a lot. PR incoming, it requires a jitdb change first, though. |
I wonder if it will be one of those: fs faster in browser, level faster in node ;-) |
I found another thing that maybe indicates some opportunity to optimize. This is what manyverse mobile startup looks like: It takes about 5s in total.
What's funny is that the private posts query is quick but the public posts query is slow:
Same story on desktop too:
@arj03 Any idea what may be causing this? |
@arj03 Just for your information, Chrome CPU profiler also annotates the source code with the time it spent in each line of code. This is from manyverse desktop, picturing the new ssb-friends: |
I'll have to try the private query to see. That reminds me, I don't think we have a benchmark of that. |
Note, it's the public one that's slow. It's even weirder to me, because I'd expect the private one to be slow because it does a decrypt. |
Can reproduce. Let the debugging begin :-) |
Good news: Manyverse now with db2 1.17.1 has a startup time of 4s |
I'm copy pasting this table from #191 because it's useful and didn't want it to get lost.
|
Might be good to include these numbers in some form in the README.md documentation |
We can make another .md file for these |
What I conclude from maxCpu is that it basically slows down the task duration by 1.7x while providing a better user experience. We have to factor that into the 4x speedup that db2 provided, meaning that in the end db2 is about 2.3x faster (4 / 1.7 ≈ 2.35), but it's more like:
Temperature on mobile is a bigger deal because the device is in physical contact with the user. |
Not really an issue, I just want some place where I can report my analysis on the benchmark runs.
On my computer, npm run benchmark-no-create from #107 reported 8030ms for the initial indexing test.
In a profiler, the big picture looks like this:
There are roughly 4 parts there:
require everything, set up secret-stack plugins, all that stuff (looks like spiky chaos on the left)
createMissingIndexes, and in this process it needs to do a log.stream, and this log.stream performs decrypt and so forth, and then the final spike before the silence is getBitsetForOperation etc
Stats
Ideas
Separate benchmark reports for migration
Migration's duration should be measured separately from base indexing.
Don't strictly require indexing base index before db.query()
If db.query() would only use JITDB, we shouldn't block it from running until base index is done. In the example in benchmark/index.js, we don't use the base index, so we should be getting the db.query results much earlier.
Allow customizing when should decrypt run inside log.stream
For the db.query in this example, we didn't need to consider private messages. There should be a way of specifying that we don't need decrypt inside log.stream when indexing something in JITDB. In the ideal situation, JITDB would automatically detect whether it needs unboxing, and pass that down as a parameter to log.stream, but a manual opt-in might also be okay.
Push-streams could/should be paused
If we're doing asynchronous work upon receiving an item from a push-stream, we should pause the push-stream until the async work is done, and then resume it. This would use the backpressure capabilities of push-stream. Probably we're not in deep trouble if we don't do this, but depending on the corner case, if we don't do backpressure, we might just keep accumulating tasks and incoming data until we exhaust the computer's RAM, and this can have adverse effects elsewhere.
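A minimal sketch of that pause/resume pattern. The two objects below are hypothetical stand-ins loosely modeled on a "sink exposes `paused`, source exposes `resume()`" convention; they are not the push-stream library itself, and `createArraySource` and `createAsyncSink` are names made up for this example.

```javascript
// Hypothetical source: it only writes while the sink is not paused.
function createArraySource(items) {
  let i = 0
  let ended = false
  return {
    sink: null,
    pipe(sink) {
      this.sink = sink
      sink.source = this
      this.resume()
      return sink
    },
    resume() {
      // Deliver items only while the sink is ready to accept them.
      while (i < items.length && !this.sink.paused) {
        this.sink.write(items[i++])
      }
      if (i >= items.length && !this.sink.paused && !ended) {
        ended = true
        this.sink.end()
      }
    },
  }
}

// Hypothetical sink: it pauses itself while async work is in flight,
// so at most one item is being processed at any time.
function createAsyncSink(processItem, onEnd) {
  return {
    paused: false,
    source: null,
    write(item) {
      this.paused = true // apply backpressure while the async work runs
      processItem(item, () => {
        this.paused = false
        this.source.resume() // ask the source for the next item
      })
    },
    end() {
      onEnd()
    },
  }
}

// Usage: each item takes 10ms of simulated async work.
createArraySource(['a', 'b', 'c']).pipe(
  createAsyncSink(
    (item, cb) => setTimeout(() => { console.log('processed', item); cb() }, 10),
    () => console.log('done')
  )
)
```

Without the `paused` flag, the source would hand over all items immediately and the pending work would pile up in memory, which is the scenario described above.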
cc @arj03