Releases: mozack/abra2

v2.14

14 Feb 17:20
  • Avoid crash when known variant VCF not specified for RNA.
  • Limit number of possible junctions in a region.
  • Skip extremely high depth sites when calling.

v2.13

05 Feb 16:31
  • Improved handling of known indel sites in RNA
  • Minor SAM spec corrections for STAR bams
  • Support reading observed junctions directly from bam input
  • Do not attempt to use reads with length of zero (from Lenbok)
  • Added --gc option. Skips remapping if eligible regions do not contain at least one contig with an indel or splice junction (experimental)
  • Added option to not remap reads with new cigar containing deletion bracketed by introns (experimental)

v2.12

03 Nov 18:26
  • Cadabra - Close SamReader after processing chromosome (avoid too many open files errors)
  • Added guard around indel component size of 0 for complex indels
  • Ignore track and browser in bed file (from dr-artio)
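For illustration, skipping `track` and `browser` header lines when reading a BED file can be sketched as below. This is a minimal Python sketch, not ABRA2's own (Java) implementation; the function name `read_bed_regions` is hypothetical, and treating `#` lines as comments is an added assumption.

```python
def read_bed_regions(lines):
    """Yield (chrom, start, end) tuples from BED lines.

    Hypothetical helper: blank lines, '#' comments (an assumption), and
    UCSC-style 'track' / 'browser' header lines are skipped.
    """
    for line in lines:
        fields = line.split()
        # Skip blank lines.
        if not fields:
            continue
        # Skip header and comment lines; compare the whole first token so a
        # contig named e.g. 'track1' is not dropped by accident.
        if fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        yield fields[0], int(fields[1]), int(fields[2])


example = [
    "browser position chr1:1-1000",
    'track name="targets"',
    "chr1\t100\t200",
    "",
    "chr2\t5\t50\texon1",
]
regions = list(read_bed_regions(example))
```

Only the first three BED columns are kept here; any extra columns (name, score, and so on) are ignored.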

v2.11

03 Oct 15:31
  • Decreased memory footprint during final sort / mate fix stage
  • Stop dropping reads realigned across chunk boundaries
  • Assign mate info properly when one half of fragment was originally unmapped
  • Adhere to VCF spec when calling variants

v2.10

28 Sep 15:05
  • Correct issues related to single end processing
  • Avoid NegativeArraySizeException when small target regions specified

Known issues:

  • Occasional dropped reads and mates not being updated are not yet resolved.

v2.09

15 Sep 19:30

Note: Previous releases were compiled on CentOS 6. This release is compiled on CentOS 7.

  • Don't allow read to map beyond the end of a chromosome
  • Default GKL to off. Appears to cause stability issues in some cases. (Re-enable using --gkl)
  • Implemented disk backed sort / mate fix. Should address out of memory errors during sort phase.
  • Added additional logging around read buffer problems causing potential crashes
  • Update edit distance using STAR's nM tag when appropriate.
  • Improved handling of exon skipping junctions
  • Cadabra: Improved handling of reference bias when calling at repeat locations

Known issues:

  • Still seeing a couple of reads with mate not updated properly in WGS test.
  • A small number of reads are being dropped in RNA testing.

v2.08

17 Aug 16:44

Speed and accuracy are generally improved in this release (particularly for RNA).

  • Merge and error correct overlapping reads for use in assembly / contig generation.
  • Improvements to assembly triggers. Assembly is much less frequent now.
  • Avoid adapter read-through triggering unnecessary assemblies.
  • More sensitive assembly in place (less aggressive pruning / downsampling)
  • Improved handling of unannotated exon skipping junctions.
  • Default --dist to 1000 for DNA (Improves RAM usage when sorting large files). Recommend using 500000 for RNA to allow movement across large introns.
  • Handle STAR's nM tag
  • Improved options for Cadabra
  • Skip decoy chromosomes / unplaced contigs, etc. by default.
  • Identify smallest repeat unit when calculating repeat period
  • RNA speed optimization: Limit the number of contigs in the face of a large number of junction permutations
  • Improvement to ISPAN annotation for inserts
  • Fixed bug where RNA regions were being skipped due to large splice junctions
  • Added option to downgrade mapping quality when a read maps equally well to reference and a contig (potentially useful for downstream callers that handle STRs / repeats naively).
  • Treat deletion adjacent to intron as a potential alternative intron.
  • Avoid realigning very noisy reads
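As an illustration of the "smallest repeat unit" item above, one simple way to find the shortest unit that tiles a repeat sequence is sketched below. This is a Python sketch under stated assumptions, not ABRA2's actual code; the function name `smallest_repeat_unit` is hypothetical.

```python
def smallest_repeat_unit(seq):
    """Return the shortest prefix that, repeated, reproduces seq exactly.

    Hypothetical helper: tries each candidate period in increasing order,
    so the first match is the smallest unit.
    """
    n = len(seq)
    for period in range(1, n + 1):
        # A valid period must divide the length and tile the whole sequence.
        if n % period == 0 and seq[:period] * (n // period) == seq:
            return seq[:period]
    return seq  # Only reached for the empty string.


unit = smallest_repeat_unit("ATATAT")
period = len(unit)
```

With this approach, a sequence such as `ATATAT` is reported with a repeat period of 2 (unit `AT`) rather than 6.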

v2.07

06 Jun 19:54
  • Allow remap of unclipped portion of soft clipped reads when entire read does not realign
  • Update read mate info properly (output BAM files should now pass Picard's ValidateSamFile)
  • Align non-assembled contigs to all junction combos
  • Default mapq filter to 20
  • Don't allow contig to exceed max buffer size (was causing occasional segmentation faults)
  • Added --nosort option. Use to disable final sorting (and mate fixing) step.
  • Added --dist option. Use to shrink max distance a read can be moved (also impacts sorting)

v2.06

17 May 19:31
  • Now using more conservative assembly triggers
  • Added trigger for more sensitive assembly (enables modest improvement for some longer inserts)
  • Accommodate reads spanning multiple indels for observed indel contig generation
  • Correction to affine gap penalty calculations in contig alignment
  • Contigs now left aligned during semi-global alignment instead of post-hoc
  • Ported semi-global aligner to C (for speed improvement)
  • Parameterized final BAM compression level (consider using 1 for intermediate files that are to be deleted)
  • Parameterize anchor length / mismatches (consider using this along with --cons for amplicon data)
  • Parameterize max reads per region
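For readers unfamiliar with left alignment, the general idea for a deletion can be sketched as follows. This is a generic Python illustration of the concept, not the approach ABRA2 takes inside its semi-global aligner; the function name `left_align_deletion` and the 0-based coordinates are assumptions.

```python
def left_align_deletion(ref, pos, length):
    """Return the leftmost equivalent start of a deletion of ref[pos:pos+length].

    Hypothetical helper using 0-based coordinates. The deletion can slide one
    base to the left whenever the base entering the deleted window matches
    the base leaving it, because the resulting sequence is unchanged.
    """
    while pos > 0 and ref[pos - 1] == ref[pos + length - 1]:
        pos -= 1
    return pos


ref = "GCACACAT"
original_pos, del_len = 5, 2  # deletes "CA" near the right end of the repeat
aligned_pos = left_align_deletion(ref, original_pos, del_len)

# Both placements describe the same sequence after the deletion is applied.
before = ref[:original_pos] + ref[original_pos + del_len:]
after = ref[:aligned_pos] + ref[aligned_pos + del_len:]
```

Reporting the leftmost placement gives equivalent indels in repeats a single canonical position, which makes them easier to compare across reads.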

v2.05

24 Apr 15:36
  • Enable usage of high quality soft clipping and observed indels to generate putative contigs by default
    ** Note: parameters around this behavior have changed

  • Remap reads equal to min mapq

  • Output unmapped read pairs in final BAM

  • Added deBruijn graph simplification

  • Cadabra - only use primary alignments for calling

  • Various speed optimizations
    ** More conservative assembly triggers
    ** Optimization of read to contig mapping
    ** sparse_hash --> dense_hash
    ** Introduce smaller chromosome "chunks" to improve parallelization
    ** Parallelize final BAM sorting/writing in multi-BAM cases
    ** More selective permuting of regional splice junctions combinations
    ** Discard low scoring contigs during assembly
    ** Cadabra - added multi-threading