Single cell RNA Sequencing Analysis
Used for filtering, normalization, scaling, optional integration, and clustering of single or aggregated single-cell RNA-Seq datasets.
The main functional blocks of the sc-rna-analyze-wf.cwl workflow are shown below. For the detailed workflow structure, refer to CWL Viewer.
First, make sure the cwltool, Docker, git, and wget tools are installed, then proceed to the following steps.
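The prerequisite check can be automated before starting. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only confirms that each executable can be found on `PATH`. It does not verify that the Docker daemon is running or that the installed versions are compatible.

```python
import shutil

# Executables named in the prerequisites above.
REQUIRED_TOOLS = ("cwltool", "docker", "git", "wget")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

An empty list from `missing_tools()` means all four executables were located.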
❗ With the minimum required Docker configuration (4 CPUs and 20 GB of RAM), the running time is up to approximately 1 h.
- Create a temporary folder and clone the current repository.
    mkdir sc_rna
    cd sc_rna
    git clone https://github.com/Barski-lab/sc-seq-analysis.git
- Create a folder for input data. Download the required input files from Figshare using either a web browser or the commands below.
    mkdir inputs
    cd inputs
    wget -O filtered_feature_bc_matrix.tar.gz https://figshare.com/ndownloader/files/34819513
    wget -O aggregation.csv https://figshare.com/ndownloader/files/34819516
    wget -O condition.csv https://figshare.com/ndownloader/files/34819519
    wget -O mouse_cell_cycle_genes.csv https://figshare.com/ndownloader/files/34822054
- Copy the job definition file into the inputs folder.
    cp ../sc-seq-analysis/jobs/sc-rna-analyze-wf.yaml .
- Create a folder for workflow outputs and execute the sc-rna-analyze-wf.cwl workflow with the sc-rna-analyze-wf.yaml job definition file.
    cd ..
    mkdir outputs
    cd outputs
    cwltool ../sc-seq-analysis/workflows/sc-rna-analyze-wf.cwl ../inputs/sc-rna-analyze-wf.yaml
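Before launching the workflow, it can help to confirm that the inputs folder is complete. The sketch below is an illustration, not part of the repository: the helper names `missing_inputs` and `cwltool_command` are hypothetical, the file names are the ones used in the download and copy steps above, and the script is assumed to be run from the sc_rna folder. It only assembles the command as a list and leaves the actual run to cwltool.

```python
from pathlib import Path

# File names taken from the download and copy steps above.
EXPECTED_INPUTS = (
    "filtered_feature_bc_matrix.tar.gz",
    "aggregation.csv",
    "condition.csv",
    "mouse_cell_cycle_genes.csv",
    "sc-rna-analyze-wf.yaml",
)


def missing_inputs(inputs_dir, expected=EXPECTED_INPUTS):
    """Return expected file names that are absent or empty in inputs_dir."""
    inputs_dir = Path(inputs_dir)
    missing = []
    for name in expected:
        path = inputs_dir / name
        # An empty file usually indicates an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def cwltool_command(repo_dir="../sc-seq-analysis", inputs_dir="../inputs"):
    """Build the cwltool invocation from the last step, relative to outputs/."""
    return [
        "cwltool",
        f"{repo_dir}/workflows/sc-rna-analyze-wf.cwl",
        f"{inputs_dir}/sc-rna-analyze-wf.yaml",
    ]


if __name__ == "__main__":
    missing = missing_inputs("inputs")
    if missing:
        print("Missing or empty input files: " + ", ".join(missing))
    else:
        print("Inputs look complete. Run from the outputs folder:")
        print(" ".join(cwltool_command()))
```

Treating zero-byte files as missing is a design choice of this sketch; it does not validate file contents.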
Expected outputs (some of the plots and files are omitted)
Note: because the Docker image is updated frequently, your outputs may not exactly match the plots below. To reproduce exactly the same outputs, switch to commit 4819746.
Clustering results can also be explored interactively in UCSC Cell Browser using RangeHTTPServer or any other simple HTTP server.
cd html_data
python3 -m RangeHTTPServer # open http://localhost:8000/
Example of UCSC Cell Browser window.
Step 1. QC metrics and the results of low-quality cells removal.
Before low-quality cells removal | After low-quality cells removal |
---|---|
Step 2. Dimensionality reduction and evaluating confounding sources of variation.
Step 3. Cluster analysis and gene markers identification.
Example of the table with identified gene markers (top 10 rows)
resolution | cluster | feature | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj |
---|---|---|---|---|---|---|---|
0.5 | 0 | Dpt | 0 | 3.06828808 | 0.941 | 0.176 | 0 |
0.5 | 0 | Col3a1 | 0 | 2.57320598 | 0.998 | 0.814 | 0 |
0.5 | 0 | Lum | 0 | 2.48541392 | 0.951 | 0.318 | 0 |
0.5 | 0 | C4b | 0 | 2.44687844 | 0.962 | 0.444 | 0 |
0.5 | 0 | Fbn1 | 0 | 2.42990848 | 0.978 | 0.524 | 0 |
0.5 | 0 | Clec3b | 0 | 2.4144368 | 0.767 | 0.134 | 0 |
0.5 | 0 | Col14a1 | 0 | 2.39398771 | 0.897 | 0.167 | 0 |
0.5 | 0 | Sfrp1 | 0 | 2.37871487 | 0.852 | 0.304 | 0 |
0.5 | 0 | Gsn | 0 | 2.37083093 | 0.946 | 0.602 | 0 |