Parameter choices

Doga C. Gulhan edited this page Nov 13, 2022 · 8 revisions

With SigMA you can train your own classifier instead of relying on the built-in ones, and you can do so for different choices of signature catalog.

Using built-in classifiers

Tumor type and data tags

Choice of COSMIC catalog

For Sig3 prediction with the gradient boosting classifiers, use cosmic_v2_inhouse, which contains the COSMIC v2 catalog plus signatures discovered in WGS data that did not match the catalog.

For MMRD prediction with the predict_mmrd function, use cosmic_v3_inhouse, which contains the COSMIC v3 catalog plus the same additional signatures as cosmic_v2_inhouse.

Other available catalogs that can be used to train new models:

  • cosmic_v3p2: COSMIC catalog v3.2
  • cosmic_v3p2_inhouse: COSMIC catalog v3.2 with added signatures as described in the following preprint.

Which tumor type and data settings are compatible with a built-in multivariate Sig3 classifier?

Use list_tumor_types() to see the options for the tumor_type parameter of the run() function, and list_data_options() to see the available values of the data parameter. To check whether the SNV counts in your data agree with the mutation counts in the datasets used to tune the gradient boosting classifiers, run info(data, tumor_type). If the values disagree, either the cutoffs on the score listed in the Signature_3_mva column of the SigMA output need to be re-optimized or a new model needs to be tuned.
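The checks described above can be sketched in R as follows. This is a minimal illustration that assumes the SigMA package is installed; the values "msk" and "breast" are placeholders chosen for the example and should be replaced with entries reported by the two listing functions.

```r
library(SigMA)

# Show the values accepted by the tumor_type and data parameters of run().
list_tumor_types()
list_data_options()

# Placeholder choices (assumptions for this example); pick real values
# from the listings printed above.
data_tag  <- "msk"
tumor_tag <- "breast"

# Compare the SNV counts in your data with the mutation counts of the
# datasets used to tune the gradient boosting classifier.
info(data_tag, tumor_tag)
```

If the counts reported here differ from those in your samples, treat the built-in cutoffs on Signature_3_mva with caution.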

Using SigMA without Sig3 classifier

SigMA can be used without an MVA classifier. If no built-in model is available, Sig3 can still be studied by setting do_mva = F and do_assign = F in the run() function. To investigate the presence of Signature_3, set add_sig3 = T, which allows Signature 3 to be assigned to tumors even if this signature was not discovered by NMF in the WGS data for that tumor type. If the signature is already present for that tumor type in the WGS data, no changes are made.
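A call with these settings might look like the sketch below. The do_mva, do_assign, add_sig3, data and tumor_type parameters come from the text above; passing the input file as the first argument, the file path itself, and the "msk" and "breast" values are assumptions made for illustration, so check them against the documentation of run() in your installed version.

```r
library(SigMA)

# Hypothetical path to your input file; replace with your own.
genome_file <- "path/to/input_matrix.csv"

# Run SigMA without the multivariate (MVA) classifier.
# TRUE/FALSE are equivalent to the T/F shorthand used in the text.
run(
  genome_file,            # assumed to be the first argument of run()
  data       = "msk",     # placeholder; see list_data_options()
  tumor_type = "breast",  # placeholder; see list_tumor_types()
  do_mva     = FALSE,     # skip the gradient boosting classifier
  do_assign  = FALSE,
  add_sig3   = TRUE       # allow Signature 3 even if NMF did not find it
)
```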
