December 13, 2018

# DESCRIPTION

```
Package: PRODIGY
Title: Personalized prioritization of driver genes
Version: 1.0
Authors: Gal Dinstag, Ron Shamir
Reference: Dinstag, G. & Shamir, R. PRODIGY: personalized prioritization of driver genes. bioRxiv (2018)
Maintainer: Gal Dinstag <galdinstag@mail.tau.ac.il>
Description: PRODIGY analyzes the expression and SNV profiles of a patient along with data on known pathways and protein-protein interactions. PRODIGY quantifies the impact of each mutated gene on every transcriptionally deregulated pathway using the prize-collecting Steiner tree model. Mutated genes are ranked by their aggregated impact on all deregulated pathways.
Depends: R (>= 3.4.3)
License: GPL
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
```


# `analyze_PRODIGY_results`

analyze_PRODIGY_results


## Description

This function analyzes PRODIGY's results (influence score matrices) to rank individual driver genes for a group of patients.


## Usage

```r
analyze_PRODIGY_results(all_patients_scores)
```


## Arguments

| Argument | Description |
| --- | --- |
| `all_patients_scores` | Either a list of influence score matrices or a single matrix, as returned by the PRODIGY algorithm. |


## Value

A list of ranked genes per sample.


## Examples

```r
results = analyze_PRODIGY_results(all_patients_scores)
```
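
The ranking idea can be illustrated with a small sketch. It assumes that an influence matrix has one row per mutated gene and one column per deregulated pathway, and it ranks genes by the plain row sum. The helper name `rank_by_total_influence` and the toy matrix are hypothetical, and the real `analyze_PRODIGY_results` may aggregate or filter the scores differently.

```r
# Hypothetical helper, not part of the PRODIGY package.
# Assumes rows are mutated genes and columns are deregulated pathways.
rank_by_total_influence <- function(influence_matrix) {
  totals <- rowSums(influence_matrix)
  names(sort(totals, decreasing = TRUE))
}

# Toy influence matrix (made-up numbers, for illustration only).
# matrix() fills column by column, so the first three values are pathway_1.
toy_scores <- matrix(
  c(0.2, 0.9, 0.1,
    0.5, 0.4, 0.0),
  nrow = 3,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C"), c("pathway_1", "pathway_2"))
)

ranked <- rank_by_total_influence(toy_scores)
print(ranked)
```

With these toy numbers, `GENE_B` has the largest total influence and is ranked first.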

# `check_performances`

check_performances


## Description

This function calculates the mean precision, recall and F1 score of the results provided by PRODIGY against a gold-standard driver list.


## Usage

```r
check_performances(ranked_genes_lists, snv_matrix, gold_standard_drivers)
```


## Arguments

| Argument | Description |
| --- | --- |
| `ranked_genes_lists` | A named list of ranked drivers for each patient. |
| `snv_matrix` | A binary matrix with genes in rows and patients in columns. 1 = mutation. All genes must be contained in the global PPI network. |
| `gold_standard_drivers` | A list of known driver genes to be used as the gold standard for validation. |


## Examples

```r
# `results` is the output of analyze_PRODIGY_results
data(gold_standard_drivers)
check_performances(results,snv_matrix,gold_standard_drivers)
```
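
For readers unfamiliar with the three metrics, here is a minimal sketch of how they could be computed for one patient. The helper `precision_recall_f1`, the top-`k` cutoff and the restriction of the gold standard to the patient's mutated genes are all assumptions made for illustration. The source does not state how `check_performances` chooses cutoffs or averages over patients.

```r
# Hypothetical helper, not part of the PRODIGY package.
# Scores the top `k` genes of one patient's ranking against a gold standard.
precision_recall_f1 <- function(ranked_genes, mutated_genes, gold_standard, k = 5) {
  predicted <- head(ranked_genes, k)
  # Assumption: only gold-standard drivers mutated in this patient can be recovered
  relevant <- intersect(mutated_genes, gold_standard)
  hits <- length(intersect(predicted, relevant))
  precision <- if (length(predicted) > 0) hits / length(predicted) else 0
  recall <- if (length(relevant) > 0) hits / length(relevant) else 0
  f1 <- if (precision + recall > 0) 2 * precision * recall / (precision + recall) else 0
  c(precision = precision, recall = recall, f1 = f1)
}

# Toy inputs (made-up gene lists, for illustration only)
ranked <- c("TP53", "GENE_X", "KRAS", "GENE_Y")
mutated <- c("TP53", "KRAS", "APC", "GENE_X", "GENE_Y")
gold <- c("TP53", "KRAS", "APC", "BRAF")

scores <- precision_recall_f1(ranked, mutated, gold, k = 2)
print(scores)
```

A cohort-level figure could then be obtained by averaging these per-patient values, which is one plausible reading of "mean precision, recall and F1".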

# `get_DEGs`

get_DEGs


## Description

This function calculates differentially expressed genes (DEGs) for a group of patients using DESeq2. Calculating DEGs can take a while, so it is recommended to run this analysis as a preprocessing step.


## Usage

```r
get_DEGs(expression_matrix, samples, sample_origins = NULL)
```


## Arguments

| Argument | Description |
| --- | --- |
| `expression_matrix` | A read count matrix with genes in rows and patients in columns. All genes must be contained in the global PPI network. |
| `samples` | The sample labels as they appear in the expression matrix. |
| `sample_origins` | A vector whose values are either "tumor" or "normal", giving the tissue from which each column of expression_matrix was derived. This vector is used for the differential expression analysis. If no vector is specified, the sample names of expression_matrix are assumed to be in TCGA format, where the last two digits give the sample type: "01" = solid tumor and "11" = normal. |


## Value

A named list of DEGs per sample.


## References

Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1-21 (2014).


## Examples

```r
# Load expression data from TCGA and the STRING network
data(COAD_Expression)
data(STRING_network)
network = STRING_network
# Identify sample origins (tumor or normal) from the TCGA sample names
sample_origins = rep("tumor",ncol(expression_matrix))
sample_origins[substr(colnames(expression_matrix),nchar(colnames(expression_matrix)[1])-1,nchar(colnames(expression_matrix)[1]))=="11"] = "normal"
# Keep only genes that appear in the network
expression_matrix = expression_matrix[which(rownames(expression_matrix) %in% unique(c(network[,1],network[,2]))),]
# Samples to analyze (an illustrative choice: the first five tumor samples)
samples = colnames(expression_matrix)[sample_origins=="tumor"][1:5]
DEGs = get_DEGs(expression_matrix,samples,sample_origins=sample_origins)
```
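
The default rule for `sample_origins` can be written as a small standalone function. This is a sketch under the assumption stated in the argument description, namely that sample names end in a two-digit TCGA sample type. The helper name `infer_sample_origins` and the sample names are made up. As in the example above, every sample whose code is not "11" is labeled "tumor".

```r
# Hypothetical helper mirroring the default rule described for `sample_origins`.
infer_sample_origins <- function(sample_names) {
  n <- nchar(sample_names)
  # Last two characters of each name, computed per name rather than from the first name only
  type_code <- substr(sample_names, n - 1, n)
  ifelse(type_code == "11", "normal", "tumor")
}

# Made-up sample names ending in a TCGA-style two-digit sample type
toy_names <- c("TCGA-AA-0001-01", "TCGA-AA-0001-11", "TCGA-BB-0002-01")
origins <- infer_sample_origins(toy_names)
print(origins)
```

Computing the suffix per name avoids relying on all sample names having the same length.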

# `PRODIGY`

PRODIGY


## Description

This function runs the PRODIGY algorithm for a single patient.


## Usage

```r
PRODIGY(snv_matrix, expression_matrix, network = NULL, sample,
  diff_genes = NULL, alpha = 0.05, pathwayDB = "reactome",
  num_of_cores = 1, sample_origins = NULL, write_results = F,
  results_folder = "./")
```


## Arguments

| Argument | Description |
| --- | --- |
| `snv_matrix` | A binary matrix with genes in rows and patients in columns. 1 = mutation. All genes must be contained in the global PPI network. |
| `expression_matrix` | A read count matrix with genes in rows and patients in columns. All genes must be contained in the global PPI network. |
| `network` | The global PPI network. Columns describe the source protein, destination protein and interaction score, respectively. The network is treated as undirected. |
| `sample` | The sample label as it appears in the SNV and expression matrices. |
| `diff_genes` | A vector of the sample's differentially expressed genes (with gene names). All genes must be contained in the global PPI network. |
| `alpha` | The penalty exponent. |
| `pathwayDB` | The name of the pathway database from which curated pathways are taken. One of three built-in options: "reactome", "kegg" or "nci". |
| `num_of_cores` | The number of CPU cores to be used by the influence score calculation step. |
| `sample_origins` | A vector whose values are either "tumor" or "normal", giving the tissue from which each column of expression_matrix was derived. This vector is used for the differential expression analysis. If no vector is specified, the sample names of expression_matrix are assumed to be in TCGA format, where the last two digits give the sample type: "01" = solid tumor and "11" = normal. |
| `write_results` | Should the results be written to text files? |
| `results_folder` | Folder in which the resulting influence matrices are stored (if write_results = T). |


## Value

A matrix of influence scores for every mutation and every enriched pathway.


## References

Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1-21 (2014).

Sales, G., Calura, E. & Romualdi, C. graphite: GRAPH Interaction from pathway Topological Environment (2017).

Gillespie, M., Vastrik, I., Eustachio, P. D., Schmidt, E. & Bono, B. De. Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. 33, 428-432 (2005).

Schaefer, C. F. et al. PID: The pathway interaction database. Nucleic Acids Res. 37, 674-679 (2009).

Ogata, H. et al. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 27, 29-34 (1999).


## Examples

```r
# Load SNV+expression data from TCGA
data(COAD_SNV)
data(COAD_Expression)
# Load STRING network data
data(STRING_network)
network = STRING_network
# Take the first sample for which both SNV and expression data are available
sample = intersect(colnames(expression_matrix),colnames(snv_matrix))[1]
# Identify sample origins (tumor or normal)
sample_origins = rep("tumor",ncol(expression_matrix))
sample_origins[substr(colnames(expression_matrix),nchar(colnames(expression_matrix)[1])-1,nchar(colnames(expression_matrix)[1]))=="11"] = "normal"
# Run PRODIGY for the selected sample
res = PRODIGY(snv_matrix,expression_matrix,network=network,sample=sample,diff_genes=NULL,alpha=0.05,pathwayDB="reactome",num_of_cores=1,sample_origins=sample_origins)
```
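
Several arguments require that all genes be contained in the global PPI network. The following sketch shows one way to enforce that before calling `PRODIGY`. The helper `restrict_to_network` and the toy data are hypothetical. The only assumption taken from the argument table is that the first two columns of `network` hold the source and destination proteins.

```r
# Hypothetical pre-flight helper, not part of the PRODIGY package.
# `network` is assumed to hold source and destination proteins in columns 1 and 2.
restrict_to_network <- function(input_matrix, network) {
  network_genes <- unique(c(as.character(network[, 1]), as.character(network[, 2])))
  # drop = FALSE keeps the result a matrix even if a single gene remains
  input_matrix[rownames(input_matrix) %in% network_genes, , drop = FALSE]
}

# Toy network and SNV matrix (made-up values, for illustration only)
toy_network <- data.frame(
  source = c("GENE_A", "GENE_B"),
  destination = c("GENE_B", "GENE_C"),
  score = c(0.9, 0.7),
  stringsAsFactors = FALSE
)
toy_snv <- matrix(
  c(1, 0, 0, 1, 1, 0, 0, 1),
  nrow = 4,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C", "GENE_Z"), c("patient_1", "patient_2"))
)

filtered_snv <- restrict_to_network(toy_snv, toy_network)
stopifnot(all(filtered_snv %in% c(0, 1)))  # the SNV matrix should stay binary
print(rownames(filtered_snv))
```

The same helper can be applied to the expression matrix, which is what the row-filtering line in the `get_DEGs` example does inline.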

# `PRODIGY_cohort`

PRODIGY_cohort


## Description

This is a wrapper that runs PRODIGY for a cohort of patients. Processing a group of patients can take a long time, so it is advised to write PRODIGY's results to files using the write_results parameter. It is also advised to use as many cores as possible via the num_of_cores parameter.


## Usage

```r
PRODIGY_cohort(snv_matrix, expression_matrix, network = NULL,
  samples = NULL, DEGs = NULL, alpha = 0.05,
  pathwayDB = "reactome", num_of_cores = 1, sample_origins = NULL,
  write_results = F, results_folder = "./")
```


## Arguments

| Argument | Description |
| --- | --- |
| `snv_matrix` | A binary matrix with genes in rows and patients in columns. 1 = mutation. All genes must be contained in the global PPI network. |
| `expression_matrix` | A read count matrix with genes in rows and patients in columns. All genes must be contained in the global PPI network. |
| `network` | The global PPI network. Columns describe the source protein, destination protein and interaction score, respectively. The network is treated as undirected. |
| `samples` | The sample labels as they appear in the SNV and expression matrices. |
| `DEGs` | A named list of differentially expressed genes for every sample. All genes must be contained in the global PPI network. |
| `alpha` | The penalty exponent. |
| `pathwayDB` | The name of the pathway database from which curated pathways are taken. One of three built-in options: "reactome", "kegg" or "nci". |
| `num_of_cores` | The number of CPU cores to be used by the influence score calculation step. |
| `sample_origins` | A vector whose values are either "tumor" or "normal", giving the tissue from which each column of expression_matrix was derived. This vector is used for the differential expression analysis. If no vector is specified, the sample names of expression_matrix are assumed to be in TCGA format, where the last two digits give the sample type: "01" = solid tumor and "11" = normal. |
| `write_results` | Should the results be written to text files? |
| `results_folder` | Folder in which the resulting influence matrices are stored (if write_results = T). |


## Value

A list of influence score matrices.


## References

Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1-21 (2014).

Sales, G., Calura, E. & Romualdi, C. graphite: GRAPH Interaction from pathway Topological Environment (2017).


## Examples

```r
# Load SNV+expression data from TCGA
data(COAD_SNV)
data(COAD_Expression)
# Load STRING network data
data(STRING_network)
network = STRING_network
# Take samples for which both SNV and expression data are available
samples = intersect(colnames(expression_matrix),colnames(snv_matrix))[1:5]
# Get differentially expressed genes (DEGs) for all samples
expression_matrix = expression_matrix[which(rownames(expression_matrix) %in% unique(c(network[,1],network[,2]))),]
DEGs = get_DEGs(expression_matrix,samples,sample_origins=NULL)
# Identify sample origins (tumor or normal)
sample_origins = rep("tumor",ncol(expression_matrix))
sample_origins[substr(colnames(expression_matrix),nchar(colnames(expression_matrix)[1])-1,nchar(colnames(expression_matrix)[1]))=="11"] = "normal"
# Run PRODIGY
all_patients_scores = PRODIGY_cohort(snv_matrix,expression_matrix,network=network,samples=samples,DEGs=DEGs,alpha=0.05,pathwayDB="reactome",num_of_cores=1,sample_origins=sample_origins)
```
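
The general shape of a per-sample wrapper that returns a named list can be sketched as follows. This is not the package's implementation: `run_single_patient` is a hypothetical stand-in for a call to `PRODIGY()` that returns a dummy matrix, `run_cohort` is a made-up name, and the real wrapper may parallelize the work or write files along the way.

```r
# Hypothetical stand-in for a single-patient run; returns a dummy 1x1 matrix.
run_single_patient <- function(sample) {
  matrix(0, nrow = 1, ncol = 1, dimnames = list("GENE_A", "pathway_1"))
}

# Hypothetical cohort wrapper: one result per sample, named by sample label.
run_cohort <- function(samples, run_one = run_single_patient) {
  results <- lapply(samples, function(s) {
    # Keep going if one patient fails, and record NULL for that patient
    tryCatch(run_one(s), error = function(e) NULL)
  })
  names(results) <- samples
  results
}

toy_results <- run_cohort(c("patient_1", "patient_2"))
print(names(toy_results))
```

Naming the list by sample label lets downstream steps look up a patient's scores directly, for example `toy_results[["patient_1"]]`.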
