Releases: sanger-tol/ensemblrepeatdownload

v.2.0.1 - Shadowfax the Planerider (patch 1)

09 Dec 10:24
66ab344

Enhancements & fixes

  • Update module versions
  • Remove reference to Anaconda repositories
  • Remove defaults from lib/Utils.groovy

Software dependencies

Note: since the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so different processes may occasionally use different versions of the same tool. The overall software dependency changes since the last release are listed below for reference. Only Docker or Singularity containers are supported; conda is not supported.

| Dependency | Old version  | New version |
| ---------- | ------------ | ----------- |
| Python     | 3.8.3, 3.9.1 | 3.9.1       |
| samtools   | 1.17         | 1.21        |
| tabix      | 1.11         | 1.20        |

v2.0.0 – Shadowfax the Planerider

04 Jun 08:48
21757b3

This version supports the new FTP structure of Ensembl.

Enhancements & fixes

  • Support for the updated directory structure of the Ensembl FTP
  • Relative paths in the sample-sheet are now evaluated from the --outdir parameter
  • Memory usage rules for samtools dict
  • Appropriate use of tabix's TBI and CSI indexing, depending on the sequence lengths
  • New command-line parameter (--annotation_method): required for accessing the files on the Ensembl FTP
  • --outdir is a mandatory parameter
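
The choice between TBI and CSI indexing mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the decision is driven by the TBI format's coordinate limit of 2^29 bp (512 Mbp), and the function names are hypothetical. The command is only built, not executed.

```python
# Largest coordinate range a TBI index can address: 2^29 bp (512 Mbp).
# Assumption: the pipeline's real threshold may be defined differently.
TBI_MAX_SEQ_LENGTH = 2 ** 29


def choose_index_format(sequence_lengths):
    """Return "tbi" if every sequence fits within the TBI limit, else "csi"."""
    longest = max(sequence_lengths, default=0)
    return "tbi" if longest <= TBI_MAX_SEQ_LENGTH else "csi"


def tabix_command(bed_gz_path, sequence_lengths):
    """Build (but do not run) a tabix command line for a bgzipped BED file."""
    cmd = ["tabix", "--preset", "bed"]
    if choose_index_format(sequence_lengths) == "csi":
        # --csi asks tabix to write a CSI index instead of the default TBI.
        cmd.append("--csi")
    cmd.append(bed_gz_path)
    return cmd
```

For example, `tabix_command("repeats.bed.gz", [600_000_000])` includes `--csi` because one sequence exceeds the TBI limit; the file name here is a placeholder.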

Parameters

| Old parameter | New parameter       |
| ------------- | ------------------- |
|               | --annotation_method |

In the samplesheet

Old parameter New parameter
species_dir outdir
annotation_method
assembly_name

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
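
Since relative paths in the samplesheet are now evaluated from the --outdir parameter, the resolution rule can be sketched as below. This is an illustration under assumptions: that the samplesheet's `outdir` column is the path being resolved, and that absolute paths are left untouched. The function name and example paths are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_samplesheet_outdir(row_outdir, pipeline_outdir):
    """Resolve a samplesheet path against the pipeline's --outdir value.

    Absolute paths are returned unchanged (an assumption); relative paths
    are interpreted relative to pipeline_outdir.
    """
    path = PurePosixPath(row_outdir)
    if path.is_absolute():
        return path
    return PurePosixPath(pipeline_outdir) / path
```

With `--outdir /data/results`, a samplesheet entry of `Asterias_rubens/repeats` would resolve to `/data/results/Asterias_rubens/repeats` under this sketch.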

Software dependencies

Note: since the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so different processes may occasionally use different versions of the same tool. The overall software dependency changes since the last release are listed below for reference. Only Docker or Singularity containers are supported; conda is not supported.

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| multiqc    | 1.13        | 1.14        |

v1.0.0 – Gwaihir the Windlord

19 Oct 00:54

Overview

The pipeline takes a CSV file that contains assembly accession numbers, Ensembl species names (which may differ from the Tree of Life ones), and output directories.
Assembly accession numbers are optional. If one is missing, the pipeline assumes it can be retrieved from a file named ACCESSION in the standard location on disk.
The pipeline downloads the repeat annotation as a masked FASTA file and derives a BED file from it.
All files are compressed with bgzip, and indexed with samtools faidx or tabix.
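
The fallback for a missing accession number could look roughly like this. It is a sketch only: the notes do not say where the "standard location" is, so the directory is passed in as a hypothetical `species_dir` argument, and the ACCESSION file is assumed to contain just the accession on a single line.

```python
from pathlib import Path


def resolve_accession(row_accession, species_dir):
    """Return the accession from the samplesheet row, or fall back to a file.

    species_dir is a hypothetical stand-in for the "standard location on disk";
    the ACCESSION file is assumed to hold only the accession string.
    """
    value = (row_accession or "").strip()
    if value:
        return value
    # Fallback: read the accession from the ACCESSION file.
    return (Path(species_dir) / "ACCESSION").read_text().strip()
```

If the file is also absent, `read_text` raises `FileNotFoundError`, which makes the missing input visible rather than silently continuing.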

Steps involved:

  • Download the masked FASTA file from Ensembl.
  • Extract the coordinates of the masked regions into a BED file.
  • Compress and index the BED file with bgzip and tabix.
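
The extraction step can be illustrated with a short sketch. It assumes the FASTA is soft-masked (repeats written in lowercase), which the notes do not state explicitly, and it is not the pipeline's own script; the function names are hypothetical. Coordinates follow the BED convention: 0-based start, exclusive end.

```python
import re


def masked_intervals(sequence):
    """Yield (start, end) for each run of lowercase (soft-masked) bases."""
    for match in re.finditer(r"[a-z]+", sequence):
        yield match.start(), match.end()


def fasta_to_bed(fasta_text):
    """Return BED lines (name, start, end) for masked regions in a FASTA string."""
    bed = []
    name, chunks = None, []
    # A trailing ">" sentinel flushes the final record.
    for line in fasta_text.splitlines() + [">"]:
        if line.startswith(">"):
            if name is not None:
                # Join the lines first so runs spanning line breaks stay whole.
                for start, end in masked_intervals("".join(chunks)):
                    bed.append(f"{name}\t{start}\t{end}")
            header = line[1:].split()
            name = header[0] if header else None
            chunks = []
        else:
            chunks.append(line.strip())
    return bed
```

This version holds each sequence in memory, which is fine for illustration; a streaming approach would be preferable for large chromosomes.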

Dependencies

All dependencies are automatically fetched by Singularity.

  • bgzip
  • samtools
  • tabix
  • python3
  • wget
  • awk
  • gzip