alignments track performance optimization #969
Possible avenues of performance improvement:
At least one candidate is optimizing mismatch calculations. That doesn't rule out other approaches, but mismatch calculations do take a lot of the profiling time. Note that this is from viewing the left-heavy view of a profile.json from a Chrome performance trace in https://www.speedscope.app/
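One mitigation worth sketching (hypothetical, not current jbrowse code): memoize the per-read mismatch calculation so that the repeated rerenders mentioned later in this thread (side scroll, height changes, etc.) don't recompute it. `computeMismatches` below is a stand-in for the real CIGAR/MD parsing, and `readId` is an assumed stable feature id.

```typescript
// Sketch: cache mismatch results per read id so rerenders reuse them.
type Mismatch = { start: number; length: number; base: string }

// placeholder for the real CIGAR + MD-tag walk, which is the expensive part
function computeMismatches(cigar: string, seq: string): Mismatch[] {
  return []
}

const mismatchCache = new Map<string, Mismatch[]>()

function getMismatchesCached(
  readId: string,
  cigar: string,
  seq: string,
): Mismatch[] {
  let m = mismatchCache.get(readId)
  if (!m) {
    m = computeMismatches(cigar, seq)
    mismatchCache.set(readId, m)
  }
  return m
}
```

A real version would need an eviction policy (e.g. an LRU keyed on block id) so the cache doesn't itself become a memory problem.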
Note that you can also view different threads by clicking at the top (see how it is looking at only the webworker thread where it says "DedicatedWorker (3/6)").
For comparing JBrowse vs IGV, these links can be used; they put the same file head to head. JBrowse 2 link: http://localhost:3000/?config=test_data%2Fconfig.json&session=share-SMJRxmTDfB&password=K6ZqY These links are good performance tests because the BAI can fit into cache, so after you get the BAI file in cache, you can test raw page speed just by doing a page reload on each link. In JBrowse the raw time from page load to displaying the results is about 10-12s; with IGV it's about 3-4 seconds. If there is interest, more performance investigation could be done, but these links help put the pages head to head.
Loading long nanopore reads in BAM format really demonstrates slowdowns. There is large memory pressure, and the GC takes up a significant portion of stack traces. It is probably partly network speed, but largely program time. Two sources of almost half the time are (program) and (garbage collector), which take up 33% and 27% (35 seconds and 30 seconds respectively) of the time. I don't know how much of that can be trimmed off, but it pops out strongly in our profiling. Example trace attached; see https://www.speedscope.app/ for info. The speedscope app screenshots show all the calls that "arrive at" (program) and (garbage collector).
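A common way to attack this kind of GC pressure (a sketch of the general struct-of-arrays technique, not jbrowse's current implementation): instead of allocating one small object per read, keep the per-read layout fields in a few large parallel typed arrays, so millions of reads become a handful of allocations the GC barely has to visit.

```typescript
// Sketch: struct-of-arrays layout storage to reduce per-read allocations.
class ReadLayout {
  starts: Float64Array
  ends: Float64Array
  rows: Int32Array
  length = 0

  constructor(capacity: number) {
    this.starts = new Float64Array(capacity)
    this.ends = new Float64Array(capacity)
    this.rows = new Int32Array(capacity)
  }

  // append one read's layout; all three arrays share index i
  push(start: number, end: number, row: number): void {
    const i = this.length++
    this.starts[i] = start
    this.ends[i] = end
    this.rows[i] = row
  }
}
```

The trade-off is ergonomics: code that previously did `read.start` now does `layout.starts[i]`, and capacity has to be grown manually, but the allocation count drops from O(reads) to O(1).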
Note that it goes a bit faster without performance tracing on, but it still takes about 60 seconds.
I think performance improvements on BAM would go a long way. CRAM is already a fair bit faster, but it would be better if both BAM and CRAM were improved. Because many things end up "rerendering" (e.g. side scroll, height changes on snpcoverage, a new axis being calculated triggering a full rerender, etc.), it can be really valuable to make rerendering as optimal as possible. Reducing rerenders is half the battle, but optimizing the render itself is really important too. I found this a while ago and it intrigued me: https://github.com/ocxtal/udon @ihh I would be curious what you think, because you also considered compressed CIGAR-type strings.
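To illustrate the compressed-CIGAR idea (this is not the udon data structure, just a minimal sketch of the same allocation-avoidance goal): each CIGAR op can be packed into a uint32 as `length << 4 | opcode`, the same layout the BAM spec itself uses, so a whole read's CIGAR lives in one typed array instead of many small strings or objects.

```typescript
// Sketch: pack/unpack a CIGAR string into a Uint32Array (BAM-style encoding).
const OPS = 'MIDNSHP=X' // opcode order from the SAM/BAM spec

function packCigar(cigar: string): Uint32Array {
  const out: number[] = []
  const re = /(\d+)([MIDNSHP=X])/g
  let m: RegExpExecArray | null
  while ((m = re.exec(cigar))) {
    out.push((parseInt(m[1], 10) << 4) | OPS.indexOf(m[2]))
  }
  return Uint32Array.from(out)
}

function unpackCigar(packed: Uint32Array): string {
  let s = ''
  for (const v of packed) s += (v >> 4) + OPS[v & 0xf]
  return s
}
```

Rendering code could then iterate the packed array directly with shifts and masks, never materializing per-op strings.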
Is there any way we could split this issue into specific things to be done to the code? Do we know enough now for specific recommendations?
It is challenging to deliver actionable recommendations at this stage. I think more compressed in-memory representations would be valuable on some level, because there is large "GC pressure" observed in the code from what I can tell, but I don't know exactly what will deliver on that yet. The udon project listed above is a cool data structure, but there may be other approaches that would work too.
I think it would be worth making a measured benchmark, including BAM and CRAM, of JBrowse 2 vs JBrowse 1. There was a concern on the mailing list about JBrowse 2 being slower, and we should know the numbers regarding this.
My hypothesis is that memory pressure from complete serialization of features could be a factor, which suggests that SharedArrayBuffer or RPC-based feature details might be beneficial.
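One way to act on that hypothesis (a sketch under assumed feature shapes, not jbrowse's actual RPC layer): rather than structured-cloning thousands of feature objects across the worker boundary, pack the numeric fields into a single ArrayBuffer, which can then be moved zero-copy with `postMessage(msg, [buf])`. The encode/decode halves are shown below; the worker plumbing is omitted.

```typescript
// Sketch: pack feature start/end pairs into one transferable ArrayBuffer.
type Feat = { start: number; end: number }

function encodeFeatures(feats: Feat[]): ArrayBuffer {
  const buf = new ArrayBuffer(feats.length * 8) // 2 x Int32 per feature
  const view = new Int32Array(buf)
  feats.forEach((f, i) => {
    view[i * 2] = f.start
    view[i * 2 + 1] = f.end
  })
  return buf
}

function decodeFeatures(buf: ArrayBuffer): Feat[] {
  const view = new Int32Array(buf)
  const out: Feat[] = []
  for (let i = 0; i < view.length; i += 2) {
    out.push({ start: view[i], end: view[i + 1] })
  }
  return out
}
```

Real features carry many more fields (strand, flags, CIGAR, tags), so a production version would need a schema for the buffer layout, but the serialization and GC cost per feature drops sharply.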
I added some performance benchmarking for the embedded mode; it actually performs a bit faster than the webworker in some cases, but it could be worth digging into. Reproducible CRA with embedded mode here: https://github.com/cmdcolin/jb2_lgv_benchmarking_demo
One somewhat unexpected behavior is a sort of "recalculating" state that the tracks go through. The code will often recalculate, for example, the score of a SNPCoverageAdapter, and if it finds the score has changed, it updates stats and fires off multiple new block updates, which go through the full getFeatures process from the data adapter. This results in a feeling of slowness compared to, say, having the features cached and quickly updating the rendering.
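The "features cached" alternative described above could look something like this (a minimal sketch; `fetchFromAdapter` stands in for the real data adapter's getFeatures, and the region key is an assumed string form): stats recalculation then reads from the cache instead of triggering a fresh fetch per block update.

```typescript
// Sketch: region-keyed feature cache so stats refreshes skip re-fetching.
type Feature = { start: number; end: number }

function makeFeatureCache(fetchFromAdapter: (region: string) => Feature[]) {
  const cache = new Map<string, Feature[]>()
  let fetches = 0
  return {
    // returns cached features for a region, fetching at most once
    get(region: string): Feature[] {
      let feats = cache.get(region)
      if (!feats) {
        fetches++
        feats = fetchFromAdapter(region)
        cache.set(region, feats)
      }
      return feats
    },
    get fetchCount() {
      return fetches
    },
  }
}
```

Invalidation (on track config change or navigation far away) is the hard part a real implementation would need; the sketch only shows the happy path.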
Another area where performance differences are clear is sorting and coloring. These operations are nearly instantaneous in IGV, but in JBrowse 2, with a large alignments track, you can expect to wait 30s or so. I would recommend trying IGV to see.
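Part of a sort's cost can come from the comparator recomputing its key (a CIGAR walk, a config lookup) on every comparison, which is O(n log n) key computations. A standard mitigation, sketched generically here rather than taken from jbrowse's code, is to precompute one numeric key per read and sort indices, bounding key computation at O(n):

```typescript
// Sketch: sort items by a precomputed key so keyOf runs once per item,
// not once per comparison.
function sortByKey<T>(items: T[], keyOf: (t: T) => number): T[] {
  const keys = items.map(keyOf) // each key computed exactly once
  const idx = items.map((_, i) => i)
  idx.sort((a, b) => keys[a] - keys[b])
  return idx.map(i => items[i])
}
```

For "sort by base at position X", `keyOf` would be the expensive part (walking the read's CIGAR to find the base), so moving it out of the comparator is where the win comes from.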
At least one slow factor is the slowness of calling readConfObject. This is called multiple times for each read, which may explain why short-read datasets can be slower than long-read datasets.
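The fix for that pattern is usually to hoist the config reads out of the per-read loop. A sketch of the idea, where `readConfObject` below is a cheap stand-in for jbrowse's real (comparatively expensive) config accessor:

```typescript
// Sketch: resolve config values once per render, not once per read.
function readConfObject(
  config: Record<string, unknown>,
  key: string,
): unknown {
  // the real accessor walks a config schema and may evaluate callbacks;
  // assume it is costly relative to plain property access
  return config[key]
}

function renderReads(
  reads: { flags: number }[],
  config: Record<string, unknown>,
) {
  // hoisted: one readConfObject call per render instead of per read
  const color = readConfObject(config, 'color') as string
  return reads.map(r => ({ flags: r.flags, color }))
}
```

The caveat is that per-feature config callbacks genuinely vary per read and can't be hoisted; those would need a different approach, like compiling the callback once.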
Not sure I have a lot to add, but we just implemented JBrowse 2 showing nanopore data on yeast genomes (using IGV a lot in parallel), and the performance of the alignment view was surprisingly slow. When panning, it can completely break the browser, but even gene-scale or several-gene-scale viewing of 30x data can result in 5-10 second delays in Chrome when panning around or switching to a new location. The data fetches are sub-half-second, but the remaining time is very CPU-heavy rendering (200% usage typically, ~0.65GB RAM). I did some profiling, but I am not super familiar with JS profiling and didn't see any names of JBrowse functions (lots of anonymous funcs). If anyone has guidance on best profiling practices, I would be happy to build or share a profiling or BAM dataset. This is probably a controversial observation, but I wonder if the datasets are just slightly beyond the capacity of real-time processing in a browser; even IGV is a bit sluggish with a lot of RAM. We switched to wig files for coverage plots so users can leave the alignments off and move around, but they really need to see the alignment plots. I'm guessing profiling and performance optimization is the path forward, but I wonder whether other tooling like WebAssembly (I'll show myself out now), or techniques like deep zoom and prerendering a bunch of stuff and storing it, are the path forward for alignment viewing. Let me know how I can be helpful: we are mainly F# devs, transpiling with the awesome Fable tool, so less facile with raw JS, although @alfonsogarciacaro might have some thoughts there.
@daz10000 definitely good to know. This issue is really important, and we want to make sure the alignments tracks are speedy. Things like WebAssembly are definitely not out of the question. Prerendering could be an option for some specialized track type, but we would want to avoid that for most cases, since sometimes you don't have control and the files are so big; we'd like to avoid another data conversion step if possible. If you are interested in sharing the dataset, we'd be happy to look at it. Also, do you know whether you are using "@jbrowse/react-linear-genome-view" or the full jbrowse-web app with webworkers, by chance?
I think it's @jbrowse/react-linear-genome-view
Yes, the app @daz10000 mentions uses @jbrowse/react-linear-genome-view. It's a slight variation of this demo. We use the React component because we need some customization, and it's my understanding that this is not possible with the full jbrowse-web app. Ideally we wanted to integrate JBrowse into our bigger app, but we had some trouble with the build, so right now it is a separate frontend app with custom selectors for genomes and genes. I wasn't aware the React component didn't use webworkers. It'd be nice to enable them, although I assume that unless there's a good way to parallelize the work, the main benefit of the webworkers would be not blocking the UI; it would still take time to render the regions when scrolling.
#2523 adds some modest improvements, depending on your data (pending next release). Deep short-read sequencing, e.g. 1000x coverage, could get as much as a 10x speedup.
There have been modest speedups added here and there, and we will continue to work on it. One challenge is high-coverage sequencing, say of the mitochondrial genome. This file, https://s3.amazonaws.com/jbrowse.org/genomes/hg19/HG002.hs37d5.2x250.bam, has 560MB of data on the chrMT chromosome alone. Trying to visit this track in our config_demo crashes the browser. It has about 55,000-60,000x coverage (calculated by mosdepth).
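At 55,000x coverage no renderer can usefully draw every read, so one mitigation (an illustration of a generic downsampling approach, not current jbrowse behavior) is to deterministically thin the reads down to a target coverage before layout:

```typescript
// Sketch: downsample reads when estimated coverage exceeds a target.
function downsample<T extends { start: number; end: number }>(
  reads: T[],
  regionLength: number,
  targetCoverage: number,
): T[] {
  let totalBases = 0
  for (const r of reads) totalBases += r.end - r.start
  const coverage = totalBases / regionLength
  if (coverage <= targetCoverage) return reads
  // keep every k-th read; deterministic so rerenders show the same subset
  const keepEvery = Math.ceil(coverage / targetCoverage)
  return reads.filter((_, i) => i % keepEvery === 0)
}
```

A stride-based filter keeps the view stable across rerenders, unlike random sampling; the downside is that it should be paired with a visible "downsampled" indicator so users know reads were dropped.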
This track is in our config_demo.json for reference as "HG002 Illumina hs37d5.2x250" |
For reference, the above MT genome also crashes igv.js. It is simply gigantic to unzip 560MB of BAM, and we don't have methods to lazily parse that.
This issue could maybe be considered closed for now. I am happy with the performance improvements we have made to alignments tracks; in particular, the removal of serialization helped jbrowse-web/jbrowse-desktop. For better embedded performance, we may need to get support for workers in embedded mode (#2942).
Do a round of performance optimization on the Alignments track. Run some profiling, try to figure out where time is being spent, and try to improve the time it takes to render large alignments.
Main categories of stuff that takes time:
@gmod/bam-js or @gmod/cram-js
Might be worth it to run some profiling on igv.js to compare/contrast where it is spending its time vs. where JB2 is spending time.
Might also compare BAM vs CRAM, to see how they differ.
Main deliverable for this is to know where time is being spent and develop a prioritized list of optimizations that we should do.