
alignments track performance optimization #969

Closed
rbuels opened this issue May 29, 2020 · 26 comments
Labels
enhancement New feature or request

Comments

@rbuels
Contributor

rbuels commented May 29, 2020

Do a round of performance optimization on the Alignments track. Run some profiling, try to figure out where time is being spent, and try to improve the time it takes to render large alignments.

Main categories of stuff that takes time:

  • downloading the data
  • parsing the data using @gmod/bam-js or @gmod/cram-js
  • rendering in the worker onto an OffscreenCanvas
  • shipping the features, layout, and rendering to the main thread
  • deserializing the features, layout, and rendering on the main thread (mostly native code that we don't see)
  • hydrating the rendering on the main thread (ServerSideRenderedContent.js)
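The stage breakdown above can be instrumented with a coarse timing harness. This is a generic sketch (the stage names and wrapper are hypothetical, not JBrowse's actual API) for attributing wall-clock time to each stage of the pipeline:

```javascript
// Minimal stage-timing harness (sketch; stage names are hypothetical,
// not JBrowse internals). Wraps each async pipeline stage and
// accumulates wall-clock time per stage so totals can be compared.
function makeTimer() {
  const timings = {}
  return {
    async time(stage, fn) {
      const start = performance.now()
      try {
        return await fn()
      } finally {
        timings[stage] = (timings[stage] || 0) + performance.now() - start
      }
    },
    // entries sorted by time spent, largest first
    report() {
      return Object.entries(timings).sort((a, b) => b[1] - a[1])
    },
  }
}
```

Wrapping each of the download/parse/render/serialize steps like `timer.time('parse', () => parseBam(...))` and printing `timer.report()` gives a quick first-order picture before reaching for a full profiler.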

Might be worth it to run some profiling on igv.js to compare/contrast where it is spending its time vs. where JB2 is spending time.

Might also compare BAM vs CRAM, to see how they differ.

Main deliverable for this is to know where time is being spent and develop a prioritized list of optimizations that we should do.

@rbuels rbuels added the enhancement New feature or request label May 29, 2020
@rbuels
Contributor Author

rbuels commented May 29, 2020

Possible avenues of performance improvement:

@cmdcolin
Collaborator

cmdcolin commented Jun 1, 2020

At least one candidate is optimizing the mismatch calculations. That doesn't rule out other approaches, but mismatch calculations do account for a large share of the profiling time

Screenshot from 2020-05-31 23-06-06

Note that this is from viewing the left-heavy view of a profile.json from a Chrome performance trace, loaded into https://speedscope.app/
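For context, "mismatch calculation" typically means scanning each read's MD tag (and CIGAR) to locate per-base differences from the reference. A simplified sketch of an MD-tag pass (not the actual @gmod/bam-js code, which also merges CIGAR insertions/deletions) shows why it is hot: it runs a regex and allocates small objects for every read:

```javascript
// Sketch of an MD-tag mismatch scan (simplified; hypothetical record
// shape). Parses an MD string like "10A5^AC6" into reference-relative
// mismatch records. MD grammar: match-run lengths, single mismatched
// reference bases, and ^-prefixed deleted reference sequence.
function mdToMismatches(md) {
  const mismatches = []
  let pos = 0
  const re = /(\d+)|\^([A-Za-z]+)|([A-Za-z])/g
  let m
  while ((m = re.exec(md))) {
    if (m[1] !== undefined) {
      pos += parseInt(m[1], 10) // matched run: just advance
    } else if (m[2] !== undefined) {
      mismatches.push({ start: pos, type: 'deletion', length: m[2].length })
      pos += m[2].length
    } else {
      mismatches.push({ start: pos, type: 'mismatch', refBase: m[3], length: 1 })
      pos += 1
    }
  }
  return mismatches
}
```

Per-read regex execution plus one small object per mismatch is exactly the allocation pattern that shows up as both CPU time and GC pressure in the traces discussed below.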

@cmdcolin
Collaborator

cmdcolin commented Jun 1, 2020

Note that you can also view different threads by clicking at the top (see how it is looking at only the webworker thread where it says DedicatedWorker (3/6))

@cmdcolin
Collaborator

cmdcolin commented Jun 30, 2020

For comparing JBrowse vs igv.js these links can be used; they put the same file head to head

IGV link https://igv.org/app/?sessionURL=blob:rZNfa9swFMW_StHTBo5sxynBflxh68Oesow9jBBk.9rWqj_ulZw_DfnuvXLT0UDpui2BBKIrHen.zrkHhtAAgqmAFQcma1awrk1zFjEjNK2x20ELc_XhZnHTZfM41D5SsRHOi..Lr2G7970r4thlXGjxYI3YOl5ZHct2w0u0opbGeekHD9xiG7dgrAYXO7gf5cYfPgqSsDQ17C4uTL.SxKu9t6Uw9YX0g9wnkuN.59kxYspWg2PFT1Z1mBbpNEry2STNomyesVXEPIrqLtQPzO_7wJaUhhF9xCzWgKyY5EkyT_N8ej2bz5I8T8kH.62XxoSqxwGO0eHZGgRRu7XCtbsrMaMe16bVCicJn_JsrUXfQ81dr6T3gI6XQgfnpIL_Om9RC0.nn_4OqN5C.Ys4bt05v5HdeGX8Ty94T0Qufi99Q4KcHbCC5ZN9ASWtncwUSrZGg_Fhds49i1gHsu2IWpYkFEO7ARQtLEMgbk.VayoIpUDBZ4T7ZYfgOqtoHulNv.ORvXB_AQ3l5.oLGHAvfaGBpo7Dw_7gTYg58eFnfOh0UOTOoicGlGzePvzFXL5PkPsy0NxIJ0tJlPc_SN9uKf8UeARtN6IktkUjlCN6p.7TZPy8gvcs1K838JZzxlgvvLSG1rTYLSg6wRC6iTZSAsKp58G22jrqjnaOlFfH1fER

JBrowse 2 link http://localhost:3000/?config=test_data%2Fconfig.json&session=share-SMJRxmTDfB&password=K6ZqY

These links are good performance tests because the BAI can fit into the browser cache; once the BAI file is cached, you can test the raw page speed just by reloading each link

In JBrowse, the raw time from page load to displaying the results is about 10-12 seconds; with IGV it's about 3-4 seconds

If there is interest, more performance investigation could be done, but these links help put the pages head to head
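For the reload comparison above, a single timing per page is noisy; a small helper (generic sketch, not part of JBrowse, assuming you record one load-to-render number per reload) can summarize repeated trials so one outlier reload doesn't dominate:

```javascript
// Summarize repeated page-load timings (milliseconds) from a manual
// reload test: median is more robust to a single slow outlier than mean.
function summarize(samplesMs) {
  const sorted = [...samplesMs].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  const median =
    sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
  const mean = sorted.reduce((a, b) => a + b, 0) / sorted.length
  return { median, mean, min: sorted[0], max: sorted[sorted.length - 1] }
}
```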

@cmdcolin
Collaborator

Loading long nanopore reads in BAM format can really demonstrate slowdowns. There is large memory pressure, and the GC takes up a significant portion of stack traces

Loading this took 120 seconds for me http://localhost:3000/?config=test_data%2Fconfig_demo.json&session=eJztVltv2jAU_ivIT5vEJYRLS95opxa0rkOFtdKmCZnkkHhL7Mx2uCr_fcdJBmFAtWnVnvpQ1Jzb9_n4fLa3hNMIiEPuYVkZg1JM8Ipt2VbN6tXs3sS2nVbX6bTqnYvmZ1IlEZU-48SxqsSTdAnyiXk6IE7rsl0lCwZLRZwvW8I8U7LVD-WjzYaY5zEVh3R9n4MFfrOHRr2Ozdcd40DlLXARwSOWQI-YzxXo0Yo4Xbvb7LbsyyqZxSOQxtTclQPvAXxknINKmBf1m1hCaSp1RhQ4srHbPbtjdW1MlrAAqQCNcxoqqBKqFESz8IBd-hX5Sep-Ly2Iw7s2Y1a02XPvh8znEXCtJiYYHQEwP0BgRKsSV_A58xNJNbI0FSgXsZAwTUIsHgruY8aIhZDEeb5TQDWH79tJLOGhtccqx-1xmtYxzvZ0SraeoXeSxzTOQqertVldMRSDW8uyK_dFcGUXXHlz1f_wtpKnkMMGmn7lLcQOulSDL-Ta2G6H_Stjox6NNcgSyysa9QsjbjON7oS7W0gimdkSrWPlNBqqVacR3QhOl6ruiqjxbSbFUkFdSL_hZxOkGga8kXGtGbI1wfU0UK0LrzONA4obX0cMklYJ4x6sDEj4HwDxj5EUURX8SIC70D_ug79h8Q3Fyd23Y24-zzfkLJ0sL_u3Pqd1f2PWO6fsBSrhLzPVkOxLVMMy2JbUyBK3Q6IyTbV8cB8K09FE7xyYN174N0B1gmI5Dj_hTDO08f3oWuA5QH04VJ60xOCTy31rL7yj2L362s-I70TaswpUPHaL-L-SYSnv37VYIr0fQpXMXlX7qtpj1f4u29L0nBLjCW-aNzcEV5vb3FgZ3ooOMWRVIJa_MoijZQK5LT8Ecou5qAPmwQCoZ9CKO31v-oj55mWyc2UiHGeQQk5yagFDEOkGzKUhyUEyzd7RGYSqjH2Nlz1I82YpChoCS-b5oLMWlCtNyki7E-Z8xO7EGZwLecqAiDl0wpDG2Stma4aEhchqAittWpc_xQ6eYKbN1NVsAU9_xPUZX5odeRw_cIiGHCcDdWHqpelPAnWShA

Probably partly network speed, but largely program time

Two sources, (program) and (garbage collector), take up 33% and 27% of the time respectively (35 seconds and 30 seconds), together accounting for well over half of it

I don't know how much of that can be trimmed off but it pops out strongly in our profiling

Example trace attached, see https://www.speedscope.app/ for info
trace.zip

[Screenshots: speedscope.app views of the trace]

The speedscope app screenshots show all the calls that "arrive at" (program) and (garbage collector)
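To illustrate the GC-pressure point, here is a generic sketch (not JBrowse code) of the usual remedy: replacing one small object per read with parallel typed arrays, which turns millions of short-lived heap allocations into a handful of flat buffers the collector never has to sweep individually:

```javascript
// Pack per-read fields into parallel typed arrays (struct-of-arrays).
// Field names are hypothetical; the point is the allocation pattern,
// not a specific feature schema.
function packReads(reads) {
  const n = reads.length
  const starts = new Int32Array(n)
  const ends = new Int32Array(n)
  const flags = new Uint16Array(n)
  for (let i = 0; i < n; i++) {
    starts[i] = reads[i].start
    ends[i] = reads[i].end
    flags[i] = reads[i].flags
  }
  return { starts, ends, flags, length: n }
}
```

The trade-off is ergonomics: downstream code must index into arrays rather than destructure objects, so this tends to pay off only on the hottest feature-heavy paths.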

@cmdcolin
Collaborator

Note that it goes a bit faster without performance tracing on but still about 60 seconds

@cmdcolin
Collaborator

cmdcolin commented Nov 5, 2020

I think performance improvements on BAM would go a long way. CRAM is already a fair bit faster, but ideally both BAM and CRAM would be improved. Many things end up "rerendering" (side scrolls, height changes on SNP coverage, newly calculated axes, full rerenders, etc.), so it pays to make rerendering as cheap as possible. Reducing the number of rerenders is half the battle, but optimizing the render itself is really important too

I found this awhile ago and it intrigued me

https://github.com/ocxtal/udon

@ihh would be curious what you think because you also considered compressed CIGAR type strings
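As a rough sketch of the compressed-CIGAR idea behind udon (heavily simplified; udon itself is a Rust library with a much richer format), per-base operation states can be run-length encoded into one flat typed array instead of kept as per-base objects:

```javascript
// Run-length encode a per-base op string (M=match, X=mismatch,
// I=insertion, D=deletion) into a flat Uint32Array of
// [opCode, runLength, opCode, runLength, ...] pairs.
// Op alphabet and layout are illustrative, not udon's actual encoding.
const OPS = { M: 0, X: 1, I: 2, D: 3 }

function rleEncode(ops) {
  const out = []
  for (const ch of ops) {
    const code = OPS[ch]
    if (out.length && out[out.length - 2] === code) {
      out[out.length - 1]++ // extend the current run
    } else {
      out.push(code, 1) // start a new run
    }
  }
  return Uint32Array.from(out)
}
```

Since alignments are overwhelmingly long runs of matches, the encoded form is tiny relative to per-base records, and it stays in one contiguous buffer that is cheap to ship between threads.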

@cmdcolin
Collaborator

cmdcolin commented Nov 5, 2020

This is a typical trace where someone has to wait 30-40 seconds for a neighboring block to render even when the data is already downloaded (note that this is SNP coverage plus pileup)

www speedscope app_

@rbuels
Contributor Author

rbuels commented Nov 24, 2020

Is there any way we could split this issue into specific things to be done to the code? Do we know enough now for specific recommendations?

@cmdcolin
Collaborator

It is challenging to deliver actionable recommendations at this stage. I think more compressed in-memory representations would be valuable, because there is large "GC pressure" visible in the profiles, but I don't know exactly what will deliver on that yet. The udon project linked above is a cool data structure, but there may be other approaches that would work too

@cmdcolin
Collaborator

I think it would be worth making a measured benchmark including BAM and CRAM of jbrowse 2 vs jbrowse 1

There was a concern on the mailing list about jbrowse 2 being slower, and we should know the actual numbers regarding this

@cmdcolin
Collaborator

My hypothesis is that memory pressure from complete serialization of features could be a factor, which suggests that a shared array buffer or RPC-based feature details might be beneficial
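A minimal sketch of the shared-array-buffer idea (hypothetical layout; real browser deployments also need cross-origin isolation headers before `SharedArrayBuffer` is available): the worker writes feature coordinates into shared memory, and the main thread reads them without paying any structured-clone serialization cost.

```javascript
// Allocate a SharedArrayBuffer holding [start, end] int32 pairs for up
// to maxFeatures features. Layout is illustrative, not JBrowse's.
function makeSharedCoords(maxFeatures) {
  const sab = new SharedArrayBuffer(maxFeatures * 2 * 4)
  return { sab, view: new Int32Array(sab) }
}

// Worker side: write feature i's coordinates directly into shared memory.
function writeFeature(view, i, start, end) {
  view[2 * i] = start
  view[2 * i + 1] = end
}
```

The main thread would construct its own `Int32Array` over the same `sab` (passed once via `postMessage`, which shares rather than copies a SharedArrayBuffer) and read the coordinates in place.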

@cmdcolin
Collaborator

cmdcolin commented Feb 2, 2021

@cmdcolin
Collaborator

cmdcolin commented Feb 5, 2021

I added some performance benchmarking for the embedded mode; it actually performs a bit faster than the webworker version in some cases, but it could be worth digging into. A reproducible create-react-app setup with the embedded component is here: https://github.com/cmdcolin/jb2_lgv_benchmarking_demo

@cmdcolin
Collaborator

One somewhat unexpected behavior is a sort of "recalculating" state that the tracks go through. The code will often recalculate, for example, the score of a SNPCoverageAdapter, and if it finds the score has changed, it updates the stats and fires off multiple new block updates, which go through the full getFeatures process on the data adapter. This feels slow compared to, say, having the features cached and quickly updating the rendering
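The caching idea at the end could look roughly like this (hypothetical API, not the actual SNPCoverageAdapter); the promise itself is cached, so concurrent block requests for the same region share a single fetch instead of each going through the full getFeatures path:

```javascript
// Wrap a getFeatures-style fetcher with a per-region cache. When stats
// change and blocks are re-requested, the region's features come from
// the cache instead of a fresh data-adapter fetch.
function makeFeatureCache(fetchFeatures) {
  const cache = new Map()
  return function getFeatures(region) {
    const key = `${region.refName}:${region.start}-${region.end}`
    if (!cache.has(key)) {
      cache.set(key, fetchFeatures(region)) // cache the promise, not the value
    }
    return cache.get(key)
  }
}
```

A real version would also need an eviction policy (e.g. LRU bounded by feature count) so deep regions don't pin unbounded memory, and invalidation when the underlying file changes.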

@cmdcolin
Collaborator

Another area where performance differences are clear is operations like sorting and coloring. These are nearly instantaneous in IGV, but in jbrowse 2, with a large alignments track, you can expect to wait 30s or so

Would recommend trying IGV to see

@cmdcolin
Collaborator

At least one slow factor is the cost of calling readConfObject. It is called multiple times for each read, which may explain why short-read datasets can be slower than long-read datasets: more reads means more calls
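A generic illustration of that kind of fix (function names are hypothetical): when a config value does not actually depend on the individual read, hoisting the lookup out of the per-read loop turns millions of config evaluations into one.

```javascript
// BEFORE (slow): readConf('color', read) called inside the loop, once
// per read. AFTER: the lookup is hoisted, valid only because this
// particular value doesn't depend on the read.
function renderReads(reads, readConf) {
  const color = readConf('color') // hoisted: evaluated once per render
  const out = []
  for (const read of reads) {
    out.push({ id: read.id, color })
  }
  return out
}
```

Config values that are callbacks over the feature can't be hoisted this way, so a real optimization would need to distinguish static from feature-dependent config slots.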

@daz10000

Not sure I have a lot to add, but we just implemented jbrowse2 showing nanopore data on yeast genomes (using IGV a lot in parallel), and the performance of the alignment view was surprisingly slow. When panned, it can completely break the browser, and even gene- or several-gene-scale viewing of 30x data can result in 5-10 second delays in Chrome when panning around or switching to a new location. The data fetches are sub-half-second, but the remaining time is very CPU-heavy rendering (typically 200% CPU usage, ~0.65 GB RAM).

I did some profiling, but I'm not super familiar with JS profiling and didn't see any names of jbrowse functions (lots of anonymous funcs). If anyone has guidance on best profiling practices, I would be happy to build or share a profiling or BAM dataset.

This is probably a controversial observation, but I wonder if the datasets are just slightly beyond the capacity of real-time processing in a browser; even IGV is a bit sluggish with a lot of RAM. We switched to wig files for coverage plots so users can leave the alignments off and move around, but they really need to see the alignment plots. I'm guessing profiling and performance optimization is the path forward, but I wonder if other tooling like WebAssembly (I'll show myself out now), or techniques like deep zoom and prerendering a bunch of stuff and storing it, are the path forward for alignment viewing.

Let me know how I can be helpful - we are mainly F# devs, transpiling with the awesome Fable tool, so less facile with raw JS, although @alfonsogarciacaro might have some thoughts there.

@cmdcolin
Collaborator

@daz10000 definitely good to know. This issue is really important, and we want to make sure the alignments tracks are speedy. Things like WebAssembly are definitely not out of the question. Prerendering could be an option for some specialized track type, but we would want to avoid that for most cases, since sometimes you don't have control and the files are so big; better to avoid another data conversion step if possible. If you are interested in sharing the dataset, we'd be happy to look at it

Also do you know if you are using the "@jbrowse/react-linear-genome-view" or the full jbrowse-web app with webworkers by chance?

@ihh
Member

ihh commented Sep 14, 2021

I think it's @jbrowse/react-linear-genome-view

@alfonsogarciacaro

Yes, the app @daz10000 mentions uses @jbrowse/react-linear-genome-view. It's a slight variation of this demo. We use the React component because we need some customization, and it's my understanding that this is not possible with the full jbrowse-web app. Ideally we wanted to integrate JBrowse into our bigger app, but we had some trouble with the build, so right now it is a separate frontend app with custom selectors for genomes and genes.

I wasn't aware the React component didn't use webworkers. It would be nice to enable them, although I assume that unless there's a good way to parallelize the work, the main benefit of webworkers would be not blocking the UI; it would still take time to render the regions when scrolling.

@cmdcolin
Collaborator

#2523 adds some modest improvements, depending on your data; pending the next release

  • deep short-read sequencing, e.g. 1000x coverage, could get as much as a 10x speedup
  • other datasets may generally get a more modest speedup, e.g. 20%
  • these stats were calculated from our main app with webworkers, though the proportional improvements likely apply to embedded mode also

@cmdcolin
Collaborator

There have been modest speedups added here and there and we will continue to work on it

One challenge is high-coverage sequencing, say on the mitochondrial genome

This file, https://s3.amazonaws.com/jbrowse.org/genomes/hg19/HG002.hs37d5.2x250.bam, has 560MB of data on the chrMT chromosome alone

Trying to visit this track in our config_demo crashes the browser. It has about 55,000-60,000x coverage (calculated by mosdepth)
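A back-of-envelope calculation suggests why this crashes (assumed numbers: chrMT is 16,569 bp, reads are 250 bp per the filename, and ~200 bytes per in-memory feature object is a rough guess for a small JS object with a few fields):

```javascript
// Rough estimate of read count and feature-object memory for the
// chrMT case. All inputs are approximations, not measurements.
const chrMTLength = 16569 // bp, human mitochondrial genome
const readLength = 250 // bp, per the 2x250 filename
const coverage = 55000 // x, lower end of the mosdepth estimate
const bytesPerFeature = 200 // rough guess for a small JS feature object

const numReads = Math.round((chrMTLength * coverage) / readLength)
const memGB = (numReads * bytesPerFeature) / 1e9
// numReads ≈ 3.6 million reads; memGB ≈ 0.7 GB in feature objects alone
```

That ~0.7 GB is before layout, mismatches, and canvas buffers, so the real footprint is a multiple of it, which is well past what a browser tab tolerates.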

@cmdcolin
Collaborator

This track is in our config_demo.json for reference as "HG002 Illumina hs37d5.2x250"

@cmdcolin
Collaborator

For reference, the above MT genome also crashes igv.js. It is simply gigantic to unzip 560MB of BAM, and we don't have methods to lazily parse that

@cmdcolin
Collaborator

This issue could maybe be considered closed for now. I am happy with the performance improvements we have made to alignments tracks; in particular, the removal of serialization helped jbrowse-web/jbrowse-desktop.

For better embedded performance, we may need to get support for workers in embedded mode (#2942)
