From ae8db5c266081742cd17fe114496cb55b4f15b0a Mon Sep 17 00:00:00 2001
From: Bethan Yates
Date: Wed, 20 Nov 2024 15:00:10 +0000
Subject: [PATCH 1/3] Updated docs ahead of release

---
 CHANGELOG.md    | 26 ++++++++++++++++++++++++++
 CITATIONS.md    |  8 ++++++++
 docs/usage.md   |  2 +-
 nextflow.config |  2 +-
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9ed60a41..d03a9b77 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,32 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [[2.1.0](https://github.com/sanger-tol/genomenote/releases/tag/2.1.0)] - Pembroke Welsh Corgi [2024-11-20]
+
+### Enhancements & fixes
+
+- New annotation_statistics subworkflow which runs BUSCO in protein mode and generates some basic statistics on the annotated gene set if provided with a GFF3 file of gene annotations using the `--annotation_set` option.
+- The genome_metadata subworkflow now queries Ensembl's GraphQL API to determine if Ensembl has released gene annotation for the assembly being processed.
+
+### Parameters
+
+| Old parameter | New parameter    |
+| ------------- | ---------------- |
+|               | --annotation_set |
+
+> **NB:** Parameter has been **updated** if both old and new parameter information is present.<br>**NB:** Parameter has been **added** if just the new parameter information is present.<br>**NB:** Parameter has been **removed** if new parameter information isn't present.
+
+### Software dependencies
+
+Note, since the pipeline is using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| gffread    |             | 0.12.7      |
+| agat       |             | 1.4.0       |
+
+> **NB:** Dependency has been **updated** if both old and new version information is present.<br>**NB:** Dependency has been **added** if just the new version information is present.<br>**NB:** Dependency has been **removed** if version information isn't present.
+
 ## [[2.0.0](https://github.com/sanger-tol/genomenote/releases/tag/2.0.0)] - English Cocker Spaniel [2024-10-10]
 
 ### Enhancements & fixes
diff --git a/CITATIONS.md b/CITATIONS.md
index 8bff25a0..392829c8 100644
--- a/CITATIONS.md
+++ b/CITATIONS.md
@@ -12,6 +12,10 @@
 
 ## Pipeline tools
 
+- [AGAT](https://github.com/NBISweden/AGAT)
+
+  > Dainat J. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format. (Version v1.4.0). Zenodo. https://www.doi.org/10.5281/zenodo.3552717
+
 - [BedTools](https://bedtools.readthedocs.io/en/latest/)
 
   > Quinlan, Aaron R., and Ira M. Hall. “BEDTools: A Flexible Suite of Utilities for Comparing Genomic Features.” Bioinformatics, vol. 26, no. 6, 2010, pp. 841–842., https://doi.org/10.1093/bioinformatics/btq033.
@@ -30,6 +34,10 @@
 
 - [FastK](https://github.com/thegenemyers/FASTK)
 
+- [GFFREAD](https://github.com/gpertea/gffread)
+
+  > Pertea G and Pertea M. "GFF Utilities: GffRead and GffCompare [version 1; peer review: 3 approved]". F1000Research 2020, 9:304 https://doi.org/10.12688/f1000research.23297.1
+
 - [MerquryFK](https://github.com/thegenemyers/MERQURY.FK)
 
 - [MultiQC](https://multiqc.info)
diff --git a/docs/usage.md b/docs/usage.md
index 9e98d4d3..f8baa6ee 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -30,7 +30,7 @@ You will need to supply the assembly accession for the genome you would like to
 
 ## Annotation input
 
-If you want to generate statistics on the set of proteins annotated for the assembly you will need to supply a GFF3 file of the predicted protein sequences. The assembly region names used in this file must match the assembly regions names used in the assembly fasta file provided with --fasta
+If you want to generate statistics on the geneset annotated for the assembly you will need to supply a GFF3 file of the predicted gene sequences.<br>The assembly region names used in this file must match the assembly region names used in the assembly fasta file provided with --fasta
 
 ```bash
 --annotation_set '[Path to annotation file :gff]
diff --git a/nextflow.config b/nextflow.config
index 25151ebc..636d52e5 100644
--- a/nextflow.config
+++ b/nextflow.config
@@ -243,7 +243,7 @@ manifest {
     description = """Creating standarised genome assembly publications"""
     mainScript = 'main.nf'
     nextflowVersion = '!>=22.10.1'
-    version = '2.0.0'
+    version = '2.1.0'
     doi = '10.5281/zenodo.7949384'
 }
 

From bbfada72181c24dc34c84392bb14ae062814126f Mon Sep 17 00:00:00 2001
From: Bethan Yates
Date: Thu, 5 Dec 2024 18:17:08 +0000
Subject: [PATCH 2/3] prettier fix

---
 CHANGELOG.md | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5dbacd52..6aa48c9f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,7 +3,6 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-
 ## [[2.1.0](https://github.com/sanger-tol/genomenote/releases/tag/2.1.0)] - Pembroke Welsh Corgi [2024-12-05]
 
 ### Enhancements & fixes
@@ -23,18 +22,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 Note, since the pipeline is using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference. Only `Docker` or `Singularity` containers are supported, `conda` is not supported.
 
-| Dependency  | Old version              | New version              |
-|-------------|--------------------------|--------------------------|
-| `agat`      |                          | 1.4.0                    |
-| `bedtools`  | 2.30.0                   | 2.31.1                   |
-| `busco`     | 5.5.0                    | 5.7.1                    |
-| `cooler`    | 0.8.11                   | 0.9.2                    |
+| Dependency  | Old version                              | New version                              |
+| ----------- | ---------------------------------------- | ---------------------------------------- |
+| `agat`      |                                          | 1.4.0                                    |
+| `bedtools`  | 2.30.0                                   | 2.31.1                                   |
+| `busco`     | 5.5.0                                    | 5.7.1                                    |
+| `cooler`    | 0.8.11                                   | 0.9.2                                    |
 | `fastk`     | 427104ea91c78c3b8b8b49f1a7d6bbeaa869ba1c | 666652151335353eef2fcd58880bcef5bc2928e1 |
-| `gffread`   |                          | 0.12.7                   |
+| `gffread`   |                                          | 0.12.7                                   |
 | `merquryfk` | d00d98157618f4e8d1a9190026b19b471055b22e | 666652151335353eef2fcd58880bcef5bc2928e1 |
-| `multiqc`   | 1.14                     | 1.25.1                   |
-| `samtools`  | 1.17                     | 1.21                     |
-
+| `multiqc`   | 1.14                                     | 1.25.1                                   |
+| `samtools`  | 1.17                                     | 1.21                                     |
 
 > **NB:** Dependency has been **updated** if both old and new version information is present.<br>**NB:** Dependency has been **added** if just the new version information is present.<br>**NB:** Dependency has been **removed** if version information isn't present.
 
@@ -68,7 +66,6 @@ Note, since the pipeline is using Nextflow DSL2, each process will be run with i
 | | --higlass_upload_directory |
 | | --higlass_data_project_dir |
 
-
 ## [[1.2.2](https://github.com/sanger-tol/genomenote/releases/tag/1.2.2)] - Pyrenean Mountain Dog (patch 2) - [2024-09-10]
 
 ### Enhancements & fixes

From e448f756e106b41230134114485b1c4b6f25085e Mon Sep 17 00:00:00 2001
From: Tyler Chafin
Date: Mon, 9 Dec 2024 09:53:09 +0000
Subject: [PATCH 3/3] Update Utils.groovy

---
 lib/Utils.groovy | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Utils.groovy b/lib/Utils.groovy
index 8d030f4e..13cb02df 100755
--- a/lib/Utils.groovy
+++ b/lib/Utils.groovy
@@ -22,7 +22,7 @@ class Utils {
 
         // Check that all channels are present
         // This channel list is ordered by required channel priority.
-        def required_channels_in_order = ['conda-forge', 'bioconda', 'defaults']
+        def required_channels_in_order = ['conda-forge', 'bioconda']
         def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean
 
         // Check that they are in the right order
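
Reviewer note on the `lib/Utils.groovy` change: the sketch below is a Python rendering of the channel check, meant only to show what dropping `'defaults'` from the required list does. The set-difference test mirrors the Groovy line in the hunk. The ordering test is an assumption, since the hunk shows only the comment "Check that they are in the right order" and not the code behind it. The function name `check_channels` is hypothetical.

```python
# Hypothetical sketch; not the pipeline's actual implementation.
REQUIRED_CHANNELS_IN_ORDER = ("conda-forge", "bioconda")  # 'defaults' no longer required


def check_channels(channels, required_in_order=REQUIRED_CHANNELS_IN_ORDER):
    """Return (channels_missing, priority_violation) for a configured channel list."""
    # Mirrors the Groovy expression: (required as Set) - (channels as Set), cast to Boolean.
    channels_missing = bool(set(required_in_order) - set(channels))

    # Assumed ordering rule: required channels must appear in the stated priority order.
    priority_violation = False
    if not channels_missing:
        positions = [channels.index(name) for name in required_in_order]
        priority_violation = positions != sorted(positions)

    return channels_missing, priority_violation


if __name__ == "__main__":
    # A configuration without 'defaults' now passes both checks.
    print(check_channels(["conda-forge", "bioconda"]))
    # Extra channels such as 'defaults' are still tolerated; they are simply not required.
    print(check_channels(["conda-forge", "bioconda", "defaults"]))
```

Under these assumptions, a user whose Conda configuration omits the `defaults` channel is no longer flagged as missing a required channel, while a configuration that lists `bioconda` ahead of `conda-forge` would still be reported as a priority violation.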