[FBMN] Adding support for SIRIUS processing. Updating Optimus and MetaboScape converters #715

Draft: wants to merge 4 commits into base: master
2 changes: 1 addition & 1 deletion feature-based-molecular-networking/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,5 +3,5 @@ include ../Makefile.deploytemplate

WORKFLOW_NAME=feature-based-molecular-networking
TOOL_FOLDER_NAME=feature-based-molecular-networking
WORKFLOW_VERSION=release_28.2
WORKFLOW_VERSION=release_29
WORKFLOW_DESCRIPTION='Feature-Based Molecular Networking (FBMN) is a computational method that bridges popular mass spectrometry data processing tools for LC-MS/MS and molecular networking analysis on GNPS. The supported tools are: MZmine, OpenMS, MS-DIAL, MetaboScape, XCMS, Progenesis QI, and the mzTab-M format. FBMN facilitates the detection of isomers that are separated by chromatographic or ion mobility separation, and provides accurate ion abundances for statistical analysis. Note that FBMN requires processing the mass spectrometry data with a feature detection and alignment tool. For rapid/qualitative analysis, we recommend using classical molecular networking that accepts unprocessed mass spectrometry files. See the FBMN documentation at https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking and refer to the "Method and Citation for Manuscripts" on the results page for citations.'
54 changes: 47 additions & 7 deletions feature-based-molecular-networking/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,18 +39,48 @@ Additionally, it is assumed there are additional columns where the per sample qu

The MGF output should contain the "SCANS" header, and it must correspond to the identifier of the "row ID". It has to be unique, and can be non sequential.

### Metaboscape
### MetaboScape

#### For MetaboScape 5.0

The feature quantification table (.CSV file, comma-separated) should include columns with the following headers:

1. FEATURE_ID
2. RT
3. PEPMASS
4. MaxIntensity
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. CCS (optional, only tims/PASEF data)
6. SIGMA_SCORE
7. NAME_METABOSCAPE
8. MOLECULAR_FORMULA
9. ADDUCT
10. KEGG
11. CAS
12. MaxIntensity
13. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
14. Sample Intensities

Sample headers do not include the file format extension ".d" (DDA) or ".tdf" (PASEF). The columns "FEATURE_ID", "RT", "PEPMASS", "MaxIntensity" are mandatory.
Important: In the metadata table, the filename MUST NOT include the ".d" or ".tdf" extension suffix.
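
As an illustration of the mandatory-column rule for MetaboScape 5.0 tables, the following minimal sketch checks a CSV header before submission. It is not the converter shipped with this workflow; the function name `missing_mandatory_columns` and the example sample column `sample_01` are hypothetical.

```python
import csv
import io

# Mandatory columns for MetaboScape 5.0 tables, as stated in the README text.
MANDATORY_COLUMNS = ("FEATURE_ID", "RT", "PEPMASS", "MaxIntensity")


def missing_mandatory_columns(csv_text, mandatory=MANDATORY_COLUMNS):
    """Return the mandatory column names that are absent from the CSV header.

    Hypothetical helper for illustration; not part of the FBMN code base.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in mandatory if name not in present]


# Example table with one sample column (the sample name is made up).
example = (
    "FEATURE_ID,RT,PEPMASS,MaxIntensity,sample_01\n"
    "1,65.2,301.1412,15000,15000\n"
)
print(missing_mandatory_columns(example))
```

An empty list means all four mandatory headers were found; any other result names the columns to add before upload.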

#### Earlier versions of MetaboScape (<5.0)

For ion mobility data, it must include a "CCS" column.
The feature quantification table (.CSV file, comma-separated) should include columns with the following headers:

All sample headers are not including the file format extension ".d" (DDA) or ".tdf" (PASEF)
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. NAME
6. MOLECULAR_FORMULA
7. ADDUCT
8. KEGG
9. CAS
10. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
11. Sample Intensities

Sample headers include the file format extension ".d". The columns "FEATURE_ID", "RT", "PEPMASS", "CAS" are mandatory.
Important: In the metadata table, the filename MUST include the ".d" extension suffix.
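
The two MetaboScape sections differ in how filenames must appear in the metadata table: with the ".d" suffix for versions before 5.0, and without any extension for 5.0. A minimal sketch of that rule follows, assuming a hypothetical helper `metadata_filename` and a boolean `legacy` flag that are not part of this workflow.

```python
def metadata_filename(sample_name, legacy):
    """Return the filename to write in the metadata table.

    Hypothetical helper for illustration only.
    legacy=True  -> MetaboScape < 5.0: the name must end with ".d".
    legacy=False -> MetaboScape 5.0: the name must carry no ".d"/".tdf" suffix.
    """
    base = sample_name
    # Strip a known extension first so the result is the same whether or not
    # the caller already included it.
    for ext in (".d", ".tdf"):
        if base.endswith(ext):
            base = base[: -len(ext)]
            break
    return base + ".d" if legacy else base


print(metadata_filename("sample_01.d", legacy=False))
print(metadata_filename("sample_01", legacy=True))
```

Mapping a ".tdf" name to ".d" in legacy mode is an assumption of this sketch; the README only mentions ".d" for earlier versions.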

### Progenesis QI

Expand Down Expand Up @@ -91,3 +121,13 @@ The feature quantification table (.TXT format, tab-separated) should include a h
Following these headers are the samples.

The MGF output should contain the "SCANS" header, and it must correspond to the identifier of the "row ID". It has to be unique, and can be non sequential.

### SIRIUS

The feature quantification table (.CSV file, comma separated) should have three columns named:

1. row ID
2. row m/z
3. row retention time

The native sample headers from SIRIUS don't include the "Peak area" suffix, so the converter adds that suffix for internal processing.
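
The suffix step can be sketched as below. This is an assumption-laden illustration, not the converter added in this pull request: the function name `add_peak_area_suffix`, the leading space in the suffix, and the example sample header `sampleA.mzML` are all hypothetical.

```python
import csv
import io

# The three feature columns named in the README text.
ID_COLUMNS = {"row ID", "row m/z", "row retention time"}
# Assumption: the suffix is appended with a leading space.
SUFFIX = " Peak area"


def add_peak_area_suffix(csv_text):
    """Append the suffix to every sample header in a SIRIUS-style table.

    Hypothetical helper; the three feature columns are left unchanged and
    headers that already end with the suffix are not modified again.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header = [
        name if name in ID_COLUMNS or name.endswith(SUFFIX) else name + SUFFIX
        for name in rows[0]
    ]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])  # data rows pass through untouched
    return out.getvalue()


table = "row ID,row m/z,row retention time,sampleA.mzML\n1,301.14,1.5,1000\n"
print(add_peak_area_suffix(table))
```

Checking for an existing suffix makes the function safe to run twice on the same table.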
Original file line number Diff line number Diff line change
Expand Up @@ -165,12 +165,13 @@
<options>
<option value="MZMINE2" label="MZmine"/>
<option value="OPENMS" label="OpenMS"/>
<option value="OPTIMUS" label="Optimus"/>
<option value="OPTIMUS" label="Optimus (Legacy)"/>
<option value="MSDIAL" label="MS-DIAL"/>
<option value="METABOSCAPE" label="MetaboScape"/>
<option value="XCMS3" label="XCMS"/>
<option value="PROGENESIS" label="Progenesis QI"/>
<option value="MZTABM" label="MzTab-M"/>
<option value="SIRIUS" label="SIRIUS (Experimental)"/>
</options>
<validator type="set"/>
<default value="MZMINE2"/>
Original file line number Diff line number Diff line change
Expand Up @@ -19,5 +19,7 @@ d0b5249f7d69443a8adcb9e43ebcc0a2,fbmn IIN,view_all_clusters_withID
8c6b9e069b5843229e0902fd55086022,fbmn with blank molecules pre-removed,
d513e6923920413a89f1669b16117e1d,fbmn with mzML files,file_summary
f932ba2238c04499b48c696008c33672,fbmn with IIN collapse,
6aa41748ea3c4872aca7f5e86c7f69c4,fbmn mzmine2 with topk library search,
9f2c15c889504df38d7730243b7b0619,fbmn mzmine2 with topk library search,
debbc6b5e28445789005d465226661e5,fbmn with no library matches,
debbc6b5e28445789005d465226661e5,fbmn with no library matches,
333295a1b04c4267863a1f8f7cbf6364,fbmn SIRIUS,view_all_clusters_withID