forked from galaxyproject/iwc
Commit
Merge pull request galaxyproject#441 from clsiguret/bacterial_genome_annotation
Add 'Bacterial genome annotation' workflow
Showing 5 changed files with 1,320 additions and 0 deletions.
19 changes: 19 additions & 0 deletions
workflows/bacterial_genomics/bacterial_genome_annotation/.dockstore.yml
@@ -0,0 +1,19 @@
version: 1.2
workflows:
- name: main
  subclass: Galaxy
  publish: true
  primaryDescriptorPath: /bacterial_genome_annotation.ga
  testParameterFiles:
  - /bacterial_genome_annotation-tests.yml
  authors:
  - name: ABRomics
    email: [email protected]
  - name: abromics-consortium
    url: https://www.abromics.fr/
  - name: Pierre Marin
    alternateName: pimarin
    orcid: 0000-0002-8304-138X
  - name: Clea Siguret
    alternateName: clsiguret
    orcid: 0009-0005-6140-0379
5 changes: 5 additions & 0 deletions
workflows/bacterial_genomics/bacterial_genome_annotation/CHANGELOG.md
@@ -0,0 +1,5 @@
# Changelog

## [1.0] - 14-06-2024

- First release
36 changes: 36 additions & 0 deletions
workflows/bacterial_genomics/bacterial_genome_annotation/README.md
@@ -0,0 +1,36 @@
# Bacterial genome annotation workflow (v1.0)

This workflow takes assembled bacterial genome FASTA files (any FASTA file is accepted) and executes the following steps:
1. Genomic annotation
    - **Bakta** to predict CDS and small proteins (sORF)
2. Integron identification
    - **IntegronFinder2** to identify CALIN elements, In0 elements, and complete integrons
3. Plasmid gene identification
    - **PlasmidFinder** to identify and type plasmid sequences
4. Insertion sequence (IS) detection
    - **ISEScan** to detect IS elements
5. Aggregating outputs into a single JSON file
    - **ToolDistillator** to extract and aggregate information from the different tool outputs into parsable JSON files

## Inputs

1. Assembled bacterial genome in FASTA format.

## Outputs

1. Genomic annotation:
    - genome annotation in tabular, GFF and several other formats
    - annotation plot
    - identified nucleotide and protein sequences
    - summary of identified genomic elements
2. Integron identification:
    - integron identification in tabular format and a summary
3. Plasmid gene identification:
    - identified plasmid genes and associated BLAST hits
4. Insertion sequence (IS) detection:
    - IS element list in tabular format
    - IS hits in FASTA format
    - ORF hits in protein and nucleotide FASTA format
    - IS annotation in GFF format
5. Aggregating outputs:
    - JSON file with information about the outputs of **Bakta**, **IntegronFinder2**, **PlasmidFinder**, **ISEScan**
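
The README names a single input, an assembled genome in FASTA format. As a minimal sketch (not part of the workflow or of this commit), a quick sanity check on such a file before upload could look like the following. The function name `contig_lengths` and the example records are hypothetical; the only assumption about the format is the usual convention that header lines start with `>`.

```python
from typing import Dict, Iterable


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig_id: sequence_length} for FASTA-formatted lines.

    Hypothetical helper for illustration only; it is not used by the workflow.
    """
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without an identifier")
            # The contig ID is the first whitespace-delimited token of the header.
            current = tokens[0]
            if current in lengths:
                raise ValueError(f"duplicate contig ID: {current}")
            lengths[current] = 0
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequence lines may be wrapped, so lengths are accumulated.
            lengths[current] += len(line)
    return lengths


# Made-up records, not taken from the test dataset.
example = [">contig1 len=8", "ACGTACGT", ">contig2", "ACG", "T"]
print(contig_lengths(example))
```

For a file on disk, the same function can be fed an open file handle, for example `contig_lengths(open("contigs.fasta"))`, where the file name is a placeholder.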
61 changes: 61 additions & 0 deletions
...lows/bacterial_genomics/bacterial_genome_annotation/bacterial_genome_annotation-tests.yml
@@ -0,0 +1,61 @@
- doc: Test outline for bacterial_genome_annotation.ga
  job:
    Input sequence fasta:
      class: File
      path: https://zenodo.org/records/11488310/files/shovill_contigs_fasta
    Select a plasmid detection database: plasmidfinder_81c11f4_2023_12_04
    Select a bacterial genome annotation database: V5.0_2023-02-20
    Select a AMRFinderPlus database: amrfinderplus_V3.12_2024-05-02.2
  outputs:
    integronfinder2_logfile_text:
      assert:
        has_text:
          text: "Writing out results for replicon"
    integronfinder2_summary:
      assert:
        has_n_columns:
          n: 6
    integronfinder2_results_tabular:
      assert:
        has_n_columns:
          n: 14
    bakta_hypothetical_tabular:
      assert:
        has_n_columns:
          n: 9
    bakta_annotation_json:
      assert:
        has_text:
          text: "aa_hexdigest"
    bakta_annotation_tabular:
      assert:
        has_n_columns:
          n: 9
    isescan_results_tabular:
      assert:
        has_n_columns:
          n: 24
    isescan_summary_tabular:
      assert:
        has_text:
          text: "nIS"
    isescan_logfile_text:
      assert:
        has_text:
          text: "Both complete and partial IS elements are reported."
    plasmidfinder_result_json:
      assert:
        has_text:
          text: "positions_in_contig"
    plasmidfinder_results_tabular:
      assert:
        has_n_columns:
          n: 8
    tooldistillator_summarize:
      assert:
        has_text:
          text: "CDS12738(DOp1)"
        has_text:
          text: "CALIN"
        has_text:
          text: "insertion_sequence"
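
The test file above relies on two kinds of checks, `has_text` and `has_n_columns`. The sketch below illustrates what such checks amount to. It is a simplified stand-in written for this note, not Galaxy's implementation: the function names mirror the assertion names, and inspecting only the first line with a tab separator is an assumption made here.

```python
def has_text(output: str, text: str) -> bool:
    """True if `text` occurs anywhere in the output (simplified stand-in)."""
    return text in output


def has_n_columns(output: str, n: int, sep: str = "\t") -> bool:
    """True if the first line of the output splits into exactly `n` fields.

    Assumption for this sketch: only the first line is inspected and the
    default separator is a tab.
    """
    first_line = output.split("\n", 1)[0].rstrip("\r")
    return len(first_line.split(sep)) == n


# Made-up outputs for illustration; they are not real tool results.
log = "INFO: Writing out results for replicon contig00001\n"
table = "a\tb\tc\td\te\tf\n1\t2\t3\t4\t5\t6\n"

print(has_text(log, "Writing out results for replicon"))
print(has_n_columns(table, 6))
```

Checks of this kind confirm that an output has the expected shape or contains a marker string without pinning the test to exact file contents, which can shift between database versions.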