Commit

Merge branch 'main' into add-all-public-servers
paulzierep committed Jun 5, 2024
2 parents 26011da + 639672e commit 6f1cf9d
Showing 4 changed files with 253 additions and 10 deletions.
9 changes: 7 additions & 2 deletions README.md
@@ -6,8 +6,13 @@ Galaxy Tool Metadata Extractor
 ![plot](docs/images/Preprint_flowchart.png)


-This tool automatically collects a table of all available Galaxy tools including their metadata. The created table
-can be filtered to only show the tools relevant for a specific community.
+This tool automatically collects a table of all available Galaxy tools including their metadata. Therefore, various sources are parsed to collect the metadata, such as:
+* github (parsing each tool wrapper)
+* bio.tools
+* bioconda
+* Galaxy instances (availability, statistics)
+
+The created table can be filtered to only show the tools relevant for a specific community.

 Any Galaxy community can be added to this project and benefit from a dedicated interactive table that can be embedded into subdomains and websites via an iframe. **Learn [how to add your community](https://training.galaxyproject.org/training-material//topics/dev/tutorials/community-tool-table/tutorial.html) in the dedicated GTN tutorial**.

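The filtering step mentioned in the README can be sketched as follows. This is only an illustration: it assumes each tool is a dict with a `ToolShed categories` list, as in `results/test.list_tools.json`, and the function name `filter_by_categories` and the chosen category are hypothetical. The project's actual filter may use different criteria.

```python
from typing import Dict, Iterable, List


def filter_by_categories(tools: List[Dict], categories: Iterable[str]) -> List[Dict]:
    """Keep tools that share at least one ToolShed category with `categories`."""
    wanted = {c.lower() for c in categories}
    return [
        tool
        for tool in tools
        if wanted & {c.lower() for c in tool.get("ToolShed categories", [])}
    ]


# Two records trimmed from results/test.list_tools.json
tools = [
    {"Galaxy wrapper id": "2d_auto_threshold", "ToolShed categories": ["Imaging"]},
    {"Galaxy wrapper id": "spades", "ToolShed categories": ["Assembly", "RNA", "Metagenomics"]},
]

# Hypothetical community interested in the "Metagenomics" category
community_tools = filter_by_categories(tools, ["Metagenomics"])
print([t["Galaxy wrapper id"] for t in community_tools])
```

Matching is case-insensitive so that a community list written as `metagenomics` still selects tools tagged `Metagenomics`.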
9 changes: 6 additions & 3 deletions bin/extract_galaxy_tools.py
@@ -27,7 +27,7 @@
 BIOTOOLS_API_URL = "https://bio.tools"
 # BIOTOOLS_API_URL = "https://130.226.25.21"

-USEGALAXY_STAR_SERVER_URLS = {
+USEGALAXY_SERVER_URLS = {
     "UseGalaxy.org (Main)": "https://usegalaxy.org",
     "UseGalaxy.org.au": "https://usegalaxy.org.au",
     "UseGalaxy.eu": "https://usegalaxy.eu",
@@ -735,8 +735,11 @@ def reduce_ontology_terms(terms: List, ontology: Any) -> List:
 )
 tool["EDAM topic (no superclasses)"] = reduce_ontology_terms(tool["EDAM topic"], ontology=edam_ontology)

+# add availability for UseGalaxy servers
+for name, url in USEGALAXY_SERVER_URLS.items():
+    tool[f"Available on {name}"] = check_tools_on_servers(tool["Galaxy tool ids"], url)
 # add availability for all UseGalaxy servers
-for name, url in USEGALAXY_STAR_SERVER_URLS.items():
+for name, url in USEGALAXY_SERVER_URLS.items():
     tool[f"Tools available on {name}"] = check_tools_on_servers(tool["Galaxy tool ids"], url)

 # add all other available servers
@@ -745,7 +748,7 @@ def reduce_ontology_terms(terms: List, ontology: Any) -> List:
 name = row["name"]

 if name.lower() not in [
-    n.lower() for n in USEGALAXY_STAR_SERVER_URLS.keys()
+    n.lower() for n in USEGALAXY_SERVER_URLS.keys()
 ]:  # do not query UseGalaxy servers again

     url = row["url"]
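The case-insensitive check in this hunk, which skips servers that were already queried, can be illustrated in isolation. The sketch below is not the script's code: the helper name `other_servers` and the `Example Galaxy` row are hypothetical, and the dictionary holds only the three entries visible in the diff.

```python
from typing import Dict, Iterable, List

# The three entries visible in the diff above; the real dict may hold more.
USEGALAXY_SERVER_URLS: Dict[str, str] = {
    "UseGalaxy.org (Main)": "https://usegalaxy.org",
    "UseGalaxy.org.au": "https://usegalaxy.org.au",
    "UseGalaxy.eu": "https://usegalaxy.eu",
}


def other_servers(rows: Iterable[Dict[str, str]], known: Iterable[str]) -> List[Dict[str, str]]:
    """Return rows whose name does not match a known server, ignoring case."""
    known_lower = {name.lower() for name in known}  # built once, not per row
    return [row for row in rows if row["name"].lower() not in known_lower]


# Hypothetical rows standing in for the table of additional public servers
rows = [
    {"name": "usegalaxy.eu", "url": "https://usegalaxy.eu"},
    {"name": "Example Galaxy", "url": "https://galaxy.example.org"},
]

remaining = other_servers(rows, USEGALAXY_SERVER_URLS.keys())
print([row["name"] for row in remaining])
```

Building the lowercase set once avoids rebuilding the list of known names for every row, which the inline list comprehension in the hunk does on each iteration.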
234 changes: 234 additions & 0 deletions results/test.list_tools.json
@@ -0,0 +1,234 @@
[
{
"Galaxy wrapper id": "2d_auto_threshold",
"Galaxy tool ids": [
"ip_threshold"
],
"Description": "Automatic thresholding",
"bio.tool id": "scikit-image",
"bio.tool ids": [
"scikit-image"
],
"biii": "scikit-image",
"bio.tool name": "scikit-image",
"bio.tool description": "Scikit-image contains image processing algorithms for SciPy, including IO, morphology, filtering, warping, color manipulation, object detection, etc.",
"EDAM operation": [
"Image analysis",
"Image annotation",
"Visualisation",
"Data handling"
],
"EDAM topic": [
"Imaging",
"Software engineering",
"Literature and language"
],
"Status": "To update",
"Source": "https://github.com/bmcv",
"ToolShed categories": [
"Imaging"
],
"ToolShed id": "2d_auto_threshold",
"Galaxy wrapper owner": "imgteam",
"Galaxy wrapper source": "https://github.com/BMCV/galaxy-image-analysis/tree/master/tools/2d_auto_threshold/",
"Galaxy wrapper parsed folder": "https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/2d_auto_threshold",
"Galaxy wrapper version": "0.0.6-2",
"Conda id": "scikit-image",
"Conda version": null,
"EDAM operation (no superclasses)": [
"Image analysis",
"Image annotation",
"Visualisation",
"Data handling"
],
"EDAM topic (no superclasses)": [
"Imaging",
"Software engineering",
"Literature and language"
],
"Tools available on UseGalaxy.org": 0,
"Tools available on UseGalaxy.org.au": 1,
"Tools available on UseGalaxy.eu": 1,
"Tools available on UseGalaxy.org.fr": 1
},
{
"Galaxy wrapper id": "abritamr",
"Galaxy tool ids": [
"abritamr"
],
"Description": "A pipeline for running AMRfinderPlus and collating results into functional classes",
"bio.tool id": null,
"bio.tool ids": [],
"biii": null,
"bio.tool name": null,
"bio.tool description": null,
"EDAM operation": [],
"EDAM topic": [],
"Status": "To update",
"Source": "https://zenodo.org/record/7370628",
"ToolShed categories": [
"Sequence Analysis"
],
"ToolShed id": "abritamr",
"Galaxy wrapper owner": "iuc",
"Galaxy wrapper source": "https://github.com/galaxyproject/tools-iuc/tree/master/tools/abritamr",
"Galaxy wrapper parsed folder": "https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/abritamr",
"Galaxy wrapper version": "1.0.14",
"Conda id": "abritamr",
"Conda version": "1.0.17",
"EDAM operation (no superclasses)": [],
"EDAM topic (no superclasses)": [],
"Tools available on UseGalaxy.org": 0,
"Tools available on UseGalaxy.org.au": 0,
"Tools available on UseGalaxy.eu": 1,
"Tools available on UseGalaxy.org.fr": 0
},
{
"Galaxy wrapper id": "aldex2",
"Galaxy tool ids": [
"aldex2"
],
"Description": "Performs analysis Of differential abundance taking sample variation into account",
"bio.tool id": "aldex2",
"bio.tool ids": [
"aldex2"
],
"biii": null,
"bio.tool name": "ALDEx2",
"bio.tool description": "A differential abundance analysis for the comparison of two or more conditions. It uses a Dirichlet-multinomial model to infer abundance from counts, that has been optimized for three or more experimental replicates. Infers sampling variation and calculates the expected FDR given the biological and sampling variation using the Wilcox rank test and Welches t-test, or the glm and Kruskal Wallis tests. Reports both P and fdr values calculated by the Benjamini Hochberg correction.",
"EDAM operation": [
"Statistical inference"
],
"EDAM topic": [
"Gene expression",
"Statistics and probability"
],
"Status": "To update",
"Source": "https://github.com/ggloor/ALDEx_bioc",
"ToolShed categories": [
"Metagenomics"
],
"ToolShed id": "aldex2",
"Galaxy wrapper owner": "iuc",
"Galaxy wrapper source": "https://github.com/galaxyproject/tools-iuc/tree/master/tools/aldex2",
"Galaxy wrapper parsed folder": "https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/aldex2",
"Galaxy wrapper version": "1.26.0",
"Conda id": "bioconductor-aldex2",
"Conda version": "1.34.0",
"EDAM operation (no superclasses)": [
"Statistical inference"
],
"EDAM topic (no superclasses)": [
"Gene expression",
"Statistics and probability"
],
"Tools available on UseGalaxy.org": 0,
"Tools available on UseGalaxy.org.au": 0,
"Tools available on UseGalaxy.eu": 1,
"Tools available on UseGalaxy.org.fr": 0
},
{
"Galaxy wrapper id": "fastp",
"Galaxy tool ids": [
"fastp"
],
"Description": "Fast all-in-one preprocessing for FASTQ files",
"bio.tool id": "fastp",
"bio.tool ids": [
"fastp"
],
"biii": null,
"bio.tool name": "fastp",
"bio.tool description": "A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.",
"EDAM operation": [
"Sequencing quality control",
"Sequence contamination filtering"
],
"EDAM topic": [
"Sequence analysis",
"Probes and primers"
],
"Status": "To update",
"Source": "https://github.com/OpenGene/fastp",
"ToolShed categories": [
"Sequence Analysis"
],
"ToolShed id": "fastp",
"Galaxy wrapper owner": "iuc",
"Galaxy wrapper source": "https://github.com/galaxyproject/tools-iuc/tree/master/tools/fastp",
"Galaxy wrapper parsed folder": "https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/fastp",
"Galaxy wrapper version": null,
"Conda id": "fastp",
"Conda version": "0.23.4",
"EDAM operation (no superclasses)": [
"Sequence contamination filtering"
],
"EDAM topic (no superclasses)": [
"Probes and primers"
],
"Tools available on UseGalaxy.org": 1,
"Tools available on UseGalaxy.org.au": 1,
"Tools available on UseGalaxy.eu": 1,
"Tools available on UseGalaxy.org.fr": 1
},
{
"Galaxy wrapper id": "spades",
"Galaxy tool ids": [
"spades_biosyntheticspades",
"spades_coronaspades",
"spades_metaplasmidspades",
"metaspades",
"spades_metaviralspades",
"spades_plasmidspades",
"rnaspades",
"spades_rnaviralspades",
"spades"
],
"Description": "SPAdes is an assembly toolkit containing various assembly pipelines. It implements the following 4 stages: assembly graph construction, k-bimer adjustment, construction of paired assembly graph and contig construction.",
"bio.tool id": "spades",
"bio.tool ids": [
"rnaviralspades",
"metaviralspades",
"metaspades",
"biosyntheticspades",
"metaplasmidspades",
"spades",
"rnaspades",
"plasmidspades",
"coronaspades"
],
"biii": null,
"bio.tool name": "SPAdes",
"bio.tool description": "St. Petersburg genome assembler \u2013 is intended for both standard isolates and single-cell MDA bacteria assemblies. SPAdes 3.9 works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. Additional contigs can be provided and can be used as long reads.",
"EDAM operation": [
"Genome assembly"
],
"EDAM topic": [
"Sequence assembly"
],
"Status": "Up-to-date",
"Source": "https://github.com/ablab/spades",
"ToolShed categories": [
"Assembly",
"RNA",
"Metagenomics"
],
"ToolShed id": "spades",
"Galaxy wrapper owner": "iuc",
"Galaxy wrapper source": "https://github.com/galaxyproject/tools-iuc/tree/master/tools/spades",
"Galaxy wrapper parsed folder": "https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/spades",
"Galaxy wrapper version": "3.15.5",
"Conda id": "spades",
"Conda version": "3.15.5",
"EDAM operation (no superclasses)": [
"Genome assembly"
],
"EDAM topic (no superclasses)": [
"Sequence assembly"
],
"Tools available on UseGalaxy.org": 9,
"Tools available on UseGalaxy.org.au": 9,
"Tools available on UseGalaxy.eu": 9,
"Tools available on UseGalaxy.org.fr": 9
}
]
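As a reading aid for the records above, the `Tools available on ...` counts can be compared with the number of entries in `Galaxy tool ids`. The sketch below assumes that each count is the number of a wrapper's tool ids found on that server, which the `spades` record (9 ids, count 9) is consistent with but the diff does not state. The helper name `availability_fraction` is hypothetical.

```python
from typing import Dict, List


def availability_fraction(tools: List[Dict], server: str) -> Dict[str, float]:
    """Map each wrapper id to the share of its tool ids reported for `server`."""
    key = f"Tools available on {server}"
    return {
        tool["Galaxy wrapper id"]: tool[key] / len(tool["Galaxy tool ids"])
        for tool in tools
        if tool["Galaxy tool ids"]  # skip wrappers without tool ids
    }


# Records trimmed from results/test.list_tools.json
tools = [
    {"Galaxy wrapper id": "abritamr", "Galaxy tool ids": ["abritamr"],
     "Tools available on UseGalaxy.org": 0, "Tools available on UseGalaxy.eu": 1},
    {"Galaxy wrapper id": "fastp", "Galaxy tool ids": ["fastp"],
     "Tools available on UseGalaxy.org": 1, "Tools available on UseGalaxy.eu": 1},
]

org = availability_fraction(tools, "UseGalaxy.org")
eu = availability_fraction(tools, "UseGalaxy.eu")
print(org)
print(eu)
```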
11 changes: 6 additions & 5 deletions results/test.list_tools.tsv
@@ -1,5 +1,6 @@
Galaxy wrapper id Total tool usage (usegalaxy.eu) No. of tool users (2022-2023) (usegalaxy.eu) Galaxy tool ids Description bio.tool id biii bio.tool name bio.tool description EDAM operation EDAM topic Status Source ToolShed categories ToolShed id Galaxy wrapper owner Galaxy wrapper source Galaxy wrapper parsed folder Galaxy wrapper version Conda id Conda version https://usegalaxy.org https://usegalaxy.org.au https://usegalaxy.eu
2d_auto_threshold 6541.0 39.0 ip_threshold Automatic thresholding scikit-image scikit-image scikit-image Scikit-image contains image processing algorithms for SciPy, including IO, morphology, filtering, warping, color manipulation, object detection, etc. Image analysis, Image annotation, Visualisation, Data handling Imaging, Software engineering, Literature and language To update https://github.com/bmcv Imaging 2d_auto_threshold imgteam https://github.com/BMCV/galaxy-image-analysis/tree/master/tools/2d_auto_threshold/ https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/2d_auto_threshold 0.0.6-2 scikit-image (0/1) (1/1) (1/1)
abritamr abritamr A pipeline for running AMRfinderPlus and collating results into functional classes To update https://zenodo.org/record/7370628 Sequence Analysis abritamr iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/abritamr https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/abritamr 1.0.14 abritamr 1.0.17 (0/1) (0/1) (0/1)
aldex2 129.0 13.0 aldex2 Performs analysis Of differential abundance taking sample variation into account aldex2 ALDEx2 A differential abundance analysis for the comparison of two or more conditions. It uses a Dirichlet-multinomial model to infer abundance from counts, that has been optimized for three or more experimental replicates. Infers sampling variation and calculates the expected FDR given the biological and sampling variation using the Wilcox rank test and Welches t-test, or the glm and Kruskal Wallis tests. Reports both P and fdr values calculated by the Benjamini Hochberg correction. Statistical inference Gene expression, Statistics and probability To update https://github.com/ggloor/ALDEx_bioc Metagenomics aldex2 iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/aldex2 https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/aldex2 1.26.0 bioconductor-aldex2 1.34.0 (0/1) (0/1) (1/1)
fastp 1055760.0 2803.0 fastp Fast all-in-one preprocessing for FASTQ files fastp fastp A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance. Sequencing quality control, Sequence contamination filtering Sequence analysis, Probes and primers To update https://github.com/OpenGene/fastp Sequence Analysis fastp iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/fastp https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/fastp fastp 0.23.4 (1/1) (1/1) (1/1)
Galaxy wrapper id Total tool usage (usegalaxy.eu) No. of tool users (2022-2023) (usegalaxy.eu) Galaxy tool ids Description bio.tool id bio.tool ids biii bio.tool name bio.tool description EDAM operation EDAM topic Status Source ToolShed categories ToolShed id Galaxy wrapper owner Galaxy wrapper source Galaxy wrapper parsed folder Galaxy wrapper version Conda id Conda version EDAM operation (no superclasses) EDAM topic (no superclasses) Tools available on UseGalaxy.org Tools available on UseGalaxy.org.au Tools available on UseGalaxy.eu Tools available on UseGalaxy.org.fr
2d_auto_threshold 6541.0 39.0 ip_threshold Automatic thresholding scikit-image scikit-image scikit-image scikit-image Scikit-image contains image processing algorithms for SciPy, including IO, morphology, filtering, warping, color manipulation, object detection, etc. Image analysis, Image annotation, Visualisation, Data handling Imaging, Software engineering, Literature and language To update https://github.com/bmcv Imaging 2d_auto_threshold imgteam https://github.com/BMCV/galaxy-image-analysis/tree/master/tools/2d_auto_threshold/ https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/2d_auto_threshold 0.0.6-2 scikit-image Image analysis, Image annotation, Visualisation, Data handling Imaging, Software engineering, Literature and language 0 1 1 1
abritamr abritamr A pipeline for running AMRfinderPlus and collating results into functional classes To update https://zenodo.org/record/7370628 Sequence Analysis abritamr iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/abritamr https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/abritamr 1.0.14 abritamr 1.0.17 0 0 1 0
aldex2 129.0 13.0 aldex2 Performs analysis Of differential abundance taking sample variation into account aldex2 aldex2 ALDEx2 A differential abundance analysis for the comparison of two or more conditions. It uses a Dirichlet-multinomial model to infer abundance from counts, that has been optimized for three or more experimental replicates. Infers sampling variation and calculates the expected FDR given the biological and sampling variation using the Wilcox rank test and Welches t-test, or the glm and Kruskal Wallis tests. Reports both P and fdr values calculated by the Benjamini Hochberg correction. Statistical inference Gene expression, Statistics and probability To update https://github.com/ggloor/ALDEx_bioc Metagenomics aldex2 iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/aldex2 https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/aldex2 1.26.0 bioconductor-aldex2 1.34.0 Statistical inference Gene expression, Statistics and probability 0 0 1 0
fastp 1055760.0 2803.0 fastp Fast all-in-one preprocessing for FASTQ files fastp fastp fastp A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance. Sequencing quality control, Sequence contamination filtering Sequence analysis, Probes and primers To update https://github.com/OpenGene/fastp Sequence Analysis fastp iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/fastp https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/fastp fastp 0.23.4 Sequence contamination filtering Probes and primers 1 1 1 1
spades 58834.0 2309.0 spades_biosyntheticspades, spades_coronaspades, spades_metaplasmidspades, metaspades, spades_metaviralspades, spades_plasmidspades, rnaspades, spades_rnaviralspades, spades SPAdes is an assembly toolkit containing various assembly pipelines. It implements the following 4 stages: assembly graph construction, k-bimer adjustment, construction of paired assembly graph and contig construction. spades rnaviralspades, metaviralspades, metaspades, biosyntheticspades, metaplasmidspades, spades, rnaspades, plasmidspades, coronaspades SPAdes St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies. SPAdes 3.9 works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. Additional contigs can be provided and can be used as long reads. Genome assembly Sequence assembly Up-to-date https://github.com/ablab/spades Assembly, RNA, Metagenomics spades iuc https://github.com/galaxyproject/tools-iuc/tree/master/tools/spades https://github.com/paulzierep/Galaxy-Tool-Metadata-Extractor-Test-Wrapper/tree/main/tools/spades 3.15.5 spades 3.15.5 Genome assembly Sequence assembly 9 9 9 9
