ROSeq - A rank based approach to modeling gene expression with filtered and normalized read count matrix. Takes in the complete filtered and normalized read count matrix, the location of the two sub-populations and the number of cores to be used.

krishan57gupta/ROSeq

ROSeq

Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data

Introduction

ROSeq is a rank-based approach to modeling gene expression from a filtered and normalized read count matrix. It takes the filtered and normalized read count matrix and the cell annotation/condition as input, and determines the differentially expressed genes between the contrasting groups of single cells. The number of cores to use is also an input parameter.

Installation

The development version of the R package can be installed with the following R commands:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("ROSeq")

The GitHub version of the R package can be installed with the following R commands:

library(devtools)
install_github('krishan57gupta/ROSeq')

Vignette tutorial

This vignette uses the Tung dataset, which is bundled with the package, to demonstrate a standard pipeline.

Example

Load the required libraries before running the example.

library(ROSeq)
library(edgeR)
#> Loading required package: limma
library(limma)

Loading the Tung dataset

samples<-list()
samples$count<-ROSeq::L_Tung_single$NA19098_NA19101_count
samples$group<-ROSeq::L_Tung_single$NA19098_NA19101_group
samples$count[1:5,1:5]
#>                 NA19098.r1.A01 NA19098.r1.A02 NA19098.r1.A03 NA19098.r1.A04
#> ENSG00000237683              0              0              0              1
#> ENSG00000187634              0              0              0              0
#> ENSG00000188976              3              6              1              3
#> ENSG00000187961              0              0              0              0
#> ENSG00000187583              0              0              0              0
#>                 NA19098.r1.A05
#> ENSG00000237683              0
#> ENSG00000187634              0
#> ENSG00000188976              4
#> ENSG00000187961              0
#> ENSG00000187583              0

Data Preprocessing:

Cell and gene filtering, followed by TMM normalization and voom transformation

The commands below perform cell/gene filtering, TMM normalization and voom transformation. Users are free to substitute an alternative preprocessing strategy with different filtering or normalization methods.

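# Convert the counts to numeric while preserving the gene names
gene_names<-rownames(samples$count)
samples$count<-apply(samples$count,2,function(x) as.numeric(x))
rownames(samples$count)<-gene_names
# Keep cells with more than 2000 detected genes
samples$count<-samples$count[,colSums(samples$count> 0) > 2000]
# Keep genes with a count above 2 in at least 3 cells
gkeep<-apply(samples$count,1,function(x) sum(x>2)>=3)
samples$count<-samples$count[gkeep,]
# TMM normalization followed by voom transformation
samples$count<-limma::voom(ROSeq::TMMnormalization(samples$count))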
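The two filtering rules above can be tried out on a small matrix to see which rows and columns survive. The sketch below uses base R only; the matrix `toy`, its values and the threshold `min_genes` are made up for illustration and are not part of ROSeq or the Tung dataset.

```r
# Toy count matrix: 4 genes x 5 cells (made-up values for illustration only)
toy <- matrix(
  c(0, 3, 5, 0, 4,
    0, 0, 1, 0, 0,
    7, 9, 3, 6, 0,
    3, 3, 3, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5))
)

# Cell filter: keep cells with more than `min_genes` detected genes.
# The pipeline above uses 2000; a toy matrix needs a much smaller value.
min_genes <- 1
toy_cells <- toy[, colSums(toy > 0) > min_genes, drop = FALSE]

# Gene filter: keep genes with a count above 2 in at least 3 of the remaining cells
gkeep <- apply(toy_cells, 1, function(x) sum(x > 2) >= 3)
toy_filtered <- toy_cells[gkeep, , drop = FALSE]

dim(toy_filtered)
```

Using `drop = FALSE` keeps the result a matrix even when only one row or column passes the filter, which can matter for small or heavily filtered datasets.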

ROSeq analysis.

Input: a gene expression matrix with genes in rows and cells in columns. The condition/group annotation of the cells also needs to be supplied. numCores can be set according to the hardware specifications of the user's computer.

output<-ROSeq(countData=samples$count$E, condition = samples$group, numCores=1)
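One possible way to pick a value for numCores is to query the machine with `detectCores()` from the base `parallel` package. This is only a sketch: the variable `n_cores` and the choice to leave one core free are assumptions made here, not ROSeq recommendations.

```r
library(parallel)

# detectCores() may return NA on some platforms, so fall back to 1
available <- detectCores()
if (is.na(available)) available <- 1L

# Leave one core free for the rest of the system, but never go below 1
n_cores <- max(1L, available - 1L)
n_cores

# The value could then be passed to the call shown above, for example:
# output <- ROSeq(countData = samples$count$E, condition = samples$group, numCores = n_cores)
```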

The results are reported as pVals and pAdj

pVals : p-value (unadjusted)
pAdj : adjusted p-value, based on the FDR method
output[1:5,]
#>                     pVals      pAdj
#> ENSG00000237683 0.6741425 0.9321651
#> ENSG00000188976 0.7484244 0.9426495
#> ENSG00000187608 0.2282451 0.8481636
#> ENSG00000188157 0.5138812 0.9082800
#> ENSG00000131591 0.1235577 0.7438811
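A typical next step is to rank genes by adjusted p-value and keep those below a chosen cutoff. The sketch below re-enters the five rows printed above as a stand-in data frame named `res` (a hypothetical name), since the full `output` object is not reproduced here. The cutoff `alpha = 0.05` is a common convention and an assumption, not a ROSeq default.

```r
# The five rows printed above, re-entered by hand as a stand-in for `output`
res <- data.frame(
  pVals = c(0.6741425, 0.7484244, 0.2282451, 0.5138812, 0.1235577),
  pAdj  = c(0.9321651, 0.9426495, 0.8481636, 0.9082800, 0.7438811),
  row.names = c("ENSG00000237683", "ENSG00000188976", "ENSG00000187608",
                "ENSG00000188157", "ENSG00000131591")
)

# Rank genes from most to least significant by adjusted p-value
ranked <- res[order(res[, "pAdj"]), , drop = FALSE]

# Keep genes below the chosen cutoff on the adjusted p-value
alpha <- 0.05
significant <- ranked[ranked[, "pAdj"] < alpha, , drop = FALSE]

rownames(ranked)[1]
nrow(significant)
```

None of these five genes passes the 0.05 cutoff, so `significant` is empty here. Indexing by column name with `[, "pAdj"]` works whether the result is a matrix or a data frame.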
