Releases: sanger-tol/ensemblrepeatdownload
v2.0.1 – Shadowfax the Planerider (patch 1)
Enhancements & fixes
- Update module versions
- Remove reference to Anaconda repositories
- Remove defaults from lib/Utils.groovy
Software dependencies
Note: since the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so the pipeline may occasionally use different versions of the same tool. The overall software dependency changes compared to the last release are listed below for reference. Only `Docker` or `Singularity` containers are supported; `conda` is not supported.
Dependency | Old version | New version |
---|---|---|
Python | 3.8.3, 3.9.1 | 3.9.1 |
samtools | 1.17 | 1.21 |
tabix | 1.11 | 1.20 |
v2.0.0 – Shadowfax the Planerider
This version supports the new FTP structure of Ensembl.
Enhancements & fixes
- Support for the updated directory structure of the Ensembl FTP
- Relative paths in the sample-sheet are now evaluated from the `--outdir` parameter
- Memory usage rules for `samtools dict`
- Appropriate use of `tabix`'s TBI and CSI indexing, depending on the sequence lengths
- New command-line parameter (`--annotation_method`): required for accessing the files on the Ensembl FTP
- `--outdir` is a mandatory parameter
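
The choice between TBI and CSI indexing mentioned above can be illustrated with a minimal sketch. It assumes the rule is driven by the TBI format's coordinate limit of 2^29 bp (512 Mbp) per sequence; the function names and the exact threshold comparison are illustrative and are not taken from the pipeline's code.

```python
# TBI indexes cannot address coordinates beyond 2**29 bp; CSI indexes can.
# Assumption: any sequence longer than this limit calls for a CSI index.
TBI_MAX_SEQUENCE_LENGTH = 2**29  # 536,870,912 bp


def choose_tabix_index(sequence_lengths):
    """Return "csi" if any sequence is too long for TBI, else "tbi"."""
    if any(length > TBI_MAX_SEQUENCE_LENGTH for length in sequence_lengths):
        return "csi"
    return "tbi"


def tabix_command(bed_gz, sequence_lengths):
    """Build a tabix command line for a bgzip-compressed BED file."""
    command = ["tabix", "--preset", "bed"]
    if choose_tabix_index(sequence_lengths) == "csi":
        # --csi asks tabix to write a CSI index instead of the default TBI.
        command.append("--csi")
    command.append(bed_gz)
    return command
```

The sequence lengths could come, for example, from the second column of the `.fai` file produced by `samtools faidx`.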
Parameters
Old parameter | New parameter |
---|---|
| | `--annotation_method` |
In the samplesheet
Old parameter | New parameter |
---|---|
species_dir | outdir |
annotation_method | |
assembly_name |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
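
As an illustration of how a relative `outdir` value from the samplesheet might be evaluated against the `--outdir` parameter, here is a minimal sketch. The helper name and the example paths are hypothetical, and the assumption that absolute paths are left unchanged is not stated in these release notes.

```python
from pathlib import Path


def resolve_sample_outdir(sample_outdir, pipeline_outdir):
    """Resolve a samplesheet `outdir` value against the --outdir parameter."""
    sample_path = Path(sample_outdir)
    # Assumption: absolute paths in the samplesheet are used as they are.
    if sample_path.is_absolute():
        return sample_path
    # Relative paths are interpreted from the --outdir directory.
    return Path(pipeline_outdir) / sample_path
```

With `--outdir /data/out` (a made-up path), a samplesheet value of `species_a/assembly_1` would resolve to `/data/out/species_a/assembly_1`.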
Software dependencies
Note: since the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so the pipeline may occasionally use different versions of the same tool. The overall software dependency changes compared to the last release are listed below for reference. Only `Docker` or `Singularity` containers are supported; `conda` is not supported.
Dependency | Old version | New version |
---|---|---|
multiqc | 1.13 | 1.14 |
v1.0.0 – Gwaihir the Windlord
Overview
The pipeline takes a CSV file that contains assembly accession numbers, Ensembl species names (as they may differ from the Tree of Life ones), and output directories.
Assembly accession numbers are optional. If one is missing, the pipeline assumes it can be retrieved from files named `ACCESSION` in the standard location on disk.
The pipeline downloads the repeat annotation as a masked FASTA file and a BED file.
All files are compressed with `bgzip` and indexed with `samtools faidx` or `tabix`.
Steps involved:
- Download the masked fasta file from Ensembl.
- Extract the coordinates of the masked regions into a BED file.
- Compress and index the BED file with `bgzip` and `tabix`.
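
The coordinate-extraction step above can be sketched as follows. This is a minimal illustration, not the pipeline's own implementation: it assumes the FASTA file is soft-masked (repeats written in lowercase), and it loads each sequence fully into memory, which a production tool would likely avoid. The function names are hypothetical.

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the identifier, dropping any description after it.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def masked_intervals(sequence):
    """Yield 0-based, half-open (start, end) runs of lowercase bases."""
    for match in re.finditer(r"[a-z]+", sequence):
        yield match.start(), match.end()


def fasta_to_bed(lines):
    """Return BED lines (name, start, end) for every soft-masked run."""
    return [
        f"{name}\t{start}\t{end}"
        for name, sequence in read_fasta(lines)
        for start, end in masked_intervals(sequence)
    ]
```

Joining the lines of each record before searching means that a masked run spanning a line break is reported as a single interval. BED coordinates are 0-based and half-open, which matches what `match.start()` and `match.end()` return.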
Dependencies
All dependencies are automatically fetched by Singularity.
- bgzip
- samtools
- tabix
- python3
- wget
- awk
- gzip