Home
Welcome to the Talos wiki!
This is pre-publication software that is currently under active development. Use it at your own risk. Bug reports are welcome, but support cannot be guaranteed at this time.
Pipeline for analyzing genomic read sets for public and animal health purposes.
Author: Thomas Haverkamp, @thomieh73
Contact information: please submit an issue, and the author will get back to you.
This software uses the Nextflow.io workflow system to run analyses for the quality control of Illumina shotgun metagenomic sequence data. Nextflow allows the same pipeline to run on a local computer as well as on a high-performance computing (HPC) cluster without changing the code.
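As a minimal sketch of how Nextflow makes this possible, a config file can define one profile per execution environment. The profile names and settings below are illustrative assumptions, not the configuration shipped with Talos:

```groovy
// Hypothetical profiles; Talos' real config files may look different.
profiles {
    // Used when no -profile option is given: run processes on the local machine.
    standard {
        process.executor = 'local'
    }
    // Submit each process as a Slurm job (e.g. on a compute cluster).
    // Cluster-specific settings such as the queue or account would be added here.
    slurm {
        process.executor = 'slurm'
    }
}
```

With a config like this, the same script could be started locally with `nextflow run 01_run_quality_check.nf` and on a cluster with `nextflow run 01_run_quality_check.nf -profile slurm`.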
For installation, see the installation pages. Please note: at the time of writing (May 2020), this software has only been tested on macOS Sierra and on the Saga compute cluster (i.e. under Slurm).
The pipeline has been developed as a series of scripts, where each script has a specific input and a set of logically connected analyses. Each one consists of its own Nextflow script file and a separate config file, which is used to specify inputs and software options for that specific run.
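To illustrate the idea of a per-script config file, here is a minimal sketch in Nextflow config syntax. The parameter names (`reads`, `outdir`), paths, and resource values are assumptions made for this example and are not taken from the Talos repository:

```groovy
// Hypothetical run configuration for one script; all names and values are placeholders.
params {
    // Glob pattern matching the paired-end input read files.
    reads  = "data/*_R{1,2}.fastq.gz"
    // Directory where the results of this run are written.
    outdir = "results/01_quality_check"
}

// Default resources requested for every process in this run.
process {
    cpus   = 4
    memory = '8 GB'
}
```

A file like this could be passed to Nextflow with the `-c` option, for example `nextflow run 01_run_quality_check.nf -c my_run.config`, where `my_run.config` is a placeholder file name. See the run pages for the actual parameters each script expects.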
The current pipeline contains the following scripts:
- 01_run_quality_check.nf: Basic quality control of metagenomic sequences
  - FastQC is run on all input files, followed by MultiQC, which aggregates the results.
  - Nonpareil 3 is run on all input files. This tool estimates the sequencing depth of all samples. The results are visualized graphically. See the Nonpareil GitHub repository.
- 02_simple_run.nf: Data cleanup, calculation of various statistics, and taxonomic classification
  - Calculation of sequencing depth/coverage with Nonpareil 3
  - Calculation of average genome sizes for each sample with MicrobeCensus
  - Calculation of distances between the samples using HULK to determine whether the datasets are comparable
  - Kraken 2 classification of shotgun reads (not yet implemented, but in progress)
See the run pages.
The following features are planned for future releases:
- Bracken abundance estimation
- Metagenomic assembly
- Analysis of the presence of antimicrobial resistance and virulence genes