updating code to enable more simulated data
sgosline committed Apr 21, 2021
1 parent 2c0969c commit a11b307
Showing 6 changed files with 55 additions and 16 deletions.
21 changes: 15 additions & 6 deletions perfEval/README.md
@@ -16,7 +16,7 @@ to run an algorithm included in our suite:
| sampleType | no | Optional argument to describe sample type (e.g. tumor or normal) |


## Tests of various algorithms
## Implemented algorithm metrics
Below are the three different tests we perform. For each test, there are numerous
metrics we use to evaluate the performance as well as different parameters.

@@ -35,11 +35,20 @@ Here we measure how sensitive an algorithm is to imputed vs. unimputed proteomic
The documentation to evaluate this is in the [`imputation` directory](./imputation).

### Simulated data analysis
Here we tests how well each algorithm performs on simulated data.
Here we test how well each algorithm performs on simulated data.
The documentation to test this is in the [`data-sim` directory](./data-sim).

### Immune subtype analysis
We also evaluate how well the various cell types agree with what is expected based on the mRNA-defined immune subtypes.

## Vector and matrix comparisons
The following are options:
- Vector comparisons: these measure correlation across patients OR subtypes using spearman or pearson Correlation
- Matrix comparisons: these measure distances between matrices.
## How to determine agreement
How we compare the deconvolution algorithms to the 'gold standard' of any particular approach is just as important as what data we are using. As such, we have carefully thought through the various approaches. Here are the current comparisons we employ.

### Per sample correlations
For each sample from the original matrix, we evaluate how well the cell type predictions agree between the deconvoluted protein matrix and the 'test' scenario. This test can be run using the [`deconv-corr-cwl-tool.cwl`](./correlations/deconv-corr-cwl-tool.cwl) script.
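
As a rough illustration of the per-sample comparison described above, here is a minimal Python sketch. It is not the code behind `deconv-corr-cwl-tool.cwl`; the nested-dict matrix layout, the function names, and the toy values are all assumptions made for this example, and Spearman correlation is used because it was one of the options previously listed in this README.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either vector has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation is Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def per_sample_correlation(deconv, reference):
    """Correlate cell type scores within each sample shared by both matrices.

    Both matrices are assumed to be dicts: cell type -> {sample -> score}.
    """
    cell_types = sorted(set(deconv) & set(reference))
    if not cell_types:
        return {}
    samples = sorted(
        set.intersection(*(set(deconv[c]) & set(reference[c]) for c in cell_types))
    )
    return {
        s: spearman(
            [deconv[c][s] for c in cell_types],
            [reference[c][s] for c in cell_types],
        )
        for s in samples
    }


# Hypothetical toy matrices (values invented for illustration only).
deconv = {
    "B cell": {"S1": 0.1, "S2": 0.5},
    "T cell": {"S1": 0.3, "S2": 0.2},
    "NK": {"S1": 0.6, "S2": 0.3},
}
reference = {
    "B cell": {"S1": 0.2, "S2": 0.4},
    "T cell": {"S1": 0.4, "S2": 0.5},
    "NK": {"S1": 0.9, "S2": 0.1},
}
per_sample = per_sample_correlation(deconv, reference)
```

Each sample gets one coefficient, so a sample whose cell types are ranked the same way in both matrices scores close to 1 regardless of the absolute scale of the scores.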

### Cell type correlations
For each cell type in the original matrix, we evaluate how well the predictions for that cell type agree across samples between the deconvoluted matrix and the 'test' scenario. This test can be run using the [`deconv-corrXcelltypes-cwl-tool.cwl`](./correlations/deconv-corrXcelltypes-cwl-tool.cwl) script.
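
The cell type comparison transposes the per-sample idea: one coefficient per cell type, computed across samples. The Python sketch below is only an illustration under assumptions (the nested-dict layout, the function names, and the toy values are hypothetical, and Pearson correlation is just one of the two options this README previously mentioned); it does not reproduce `deconv-corrXcelltypes-cwl-tool.cwl`.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; NaN when either vector has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def per_cell_type_correlation(deconv, reference):
    """Correlate each shared cell type across the samples both matrices cover.

    Both matrices are assumed to be dicts: cell type -> {sample -> score}.
    Cell types present in only one matrix are skipped.
    """
    result = {}
    for cell in sorted(set(deconv) & set(reference)):
        samples = sorted(set(deconv[cell]) & set(reference[cell]))
        if len(samples) < 2:
            continue  # a correlation needs at least two paired observations
        result[cell] = pearson(
            [deconv[cell][s] for s in samples],
            [reference[cell][s] for s in samples],
        )
    return result


# Hypothetical toy matrices (values invented for illustration only).
deconv = {
    "B cell": {"S1": 0.1, "S2": 0.2, "S3": 0.3},
    "T cell": {"S1": 0.5, "S2": 0.4, "S3": 0.1},
    "NK": {"S1": 0.2, "S2": 0.2, "S3": 0.2},
    "Monocyte": {"S1": 0.7, "S2": 0.1, "S3": 0.4},  # absent from reference
}
reference = {
    "B cell": {"S1": 0.2, "S2": 0.4, "S3": 0.6},
    "T cell": {"S1": 0.1, "S2": 0.2, "S3": 0.9},
    "NK": {"S1": 0.3, "S2": 0.1, "S3": 0.5},
}
per_cell_type = per_cell_type_correlation(deconv, reference)
```

Returning NaN for a cell type with constant scores (here the toy `NK` row) keeps an undefined correlation from being mistaken for a genuine zero.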

### Matrix distance metrics
To compare two matrices, we employ a number of pairwise distance metrics to determine if the two matrices are similar or not. (song to add more details here)
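
The specific distance metrics are not listed here yet, so the following Python sketch only shows the general shape of a matrix comparison. The two metrics (Frobenius distance and mean absolute difference), the nested-dict layout, and the toy values are assumptions chosen for illustration; they are not necessarily what `deconv-comparison-tool.cwl` computes.

```python
import math


def shared_entries(a, b):
    """Pair up the scores for every (cell type, sample) present in both matrices.

    Both matrices are assumed to be dicts: cell type -> {sample -> score}.
    """
    pairs = []
    for cell in sorted(set(a) & set(b)):
        for sample in sorted(set(a[cell]) & set(b[cell])):
            pairs.append((a[cell][sample], b[cell][sample]))
    return pairs


def frobenius_distance(a, b):
    """Square root of the summed squared differences over shared entries."""
    return math.sqrt(sum((x - y) ** 2 for x, y in shared_entries(a, b)))


def mean_absolute_difference(a, b):
    """Average absolute difference over shared entries; NaN if none are shared."""
    pairs = shared_entries(a, b)
    if not pairs:
        return float("nan")
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


# Hypothetical toy matrices (values invented for illustration only).
matrix_a = {
    "B cell": {"S1": 1.0, "S2": 2.0},
    "T cell": {"S1": 0.0, "S2": 4.0},
}
matrix_b = {
    "B cell": {"S1": 1.0, "S2": 5.0},
    "T cell": {"S1": 4.0, "S2": 4.0},
}
frob = frobenius_distance(matrix_a, matrix_b)
mad = mean_absolute_difference(matrix_a, matrix_b)
```

Unlike the correlations above, both distances are sensitive to scale, so they are only meaningful when the two matrices report scores in comparable units.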
13 changes: 10 additions & 3 deletions perfEval/data-sim/call-deconv-on-sim.cwl
@@ -58,6 +58,13 @@ steps:
dataType: dataType
matrix: get-sim-data/matrix
out: [deconvoluted]
match-prot-to-sig:
run: ../../simulatedData/map-sig-tool.cwl
in:
deconv-matrix: deconv-prot/deconvoluted
sig-matrix: signature
cell-matrix: get-sim-data/cellType
out: [updated-deconv]
patient-cor:
run: ../correlations/deconv-corr-cwl-tool.cwl
in:
@@ -68,7 +75,7 @@ steps:
signature: signature
sampleType: sampleType
proteomics:
source: deconv-prot/deconvoluted
source: match-prot-to-sig/updated-deconv
transcriptomics:
source: get-sim-data/cellType
out: [corr]
@@ -82,14 +89,14 @@ steps:
signature: signature
sampleType: sampleType
proteomics:
source: deconv-prot/deconvoluted
source: match-prot-to-sig/updated-deconv
transcriptomics:
source: get-sim-data/cellType
out: [corr]
matrix-distance:
run: ../comparison/deconv-comparison-tool.cwl
in:
matrixA: deconv-prot/deconvoluted
matrixA: match-prot-to-sig/updated-deconv
matrixB: get-sim-data/cellType
cancerType: permutation
aAlg: prot-alg
4 changes: 0 additions & 4 deletions perfEval/mrna-prot/alg-test.yml
@@ -4,19 +4,15 @@ cancerTypes:
prot-algorithms:
- mcpcounter
- epic
- xcell
- cibersort
- repbulk
mrna-algorithms:
- mcpcounter
- epic
- xcell
- cibersort
- repbulk
tissueTypes:
- tumor
signatures:
- class: File
path: ../../signature_matrices/LM7c.txt
- class: File
path: ../../signature_matrices/LM22.txt
2 changes: 1 addition & 1 deletion perfEval/mrna-prot/call-deconv-and-cor.cwl
@@ -35,7 +35,7 @@ outputs:

steps:
deconv-mrna:
run: ../mrna-deconv.cwl
run: mrna-deconv.cwl
in:
cancerType: cancerType
mrnaAlg: mrna-alg
Original file line number Diff line number Diff line change
@@ -20,14 +20,14 @@ inputs:

steps:
download-mrna:
run: ../mRNAData/mrna-data-cwl-tool.cwl
run: ../../mRNAData/mrna-data-cwl-tool.cwl
in:
cancerType: cancerType
sampleType: sampleType
out:
[matrix]
run-deconv:
run: run-deconv.cwl
run: ../run-deconv.cwl
in:
matrix: download-mrna/matrix
signature: signature
Expand Down
27 changes: 27 additions & 0 deletions simulatedData/map-sig-tool.cwl
@@ -0,0 +1,27 @@
label: map-sig-tool
id: map-sig-tool
cwlVersion: v1.2
class: CommandLineTool
baseCommand: Rscript

arguments:
  - /bin/mapSimDataMatrices.R

requirements:
  - class: DockerRequirement
    dockerPull: tumodeconv/sim-data


inputs:
  deconv-matrix:
    type: File
    inputBinding:
      position: 1
  sig-matrix:
    type: File
    inputBinding:
      position: 2
  cell-matrix:
    type: File
    inputBinding:
      position: 3
