Single cell Multiome ATAC Seq and RNA Seq Analysis
Used for filtering, normalization, scaling, optional integration, and clustering of single or aggregated single-cell Multiome ATAC and RNA-Seq datasets.
The main functional blocks of the sc-multiome-analyze-wf.cwl workflow are shown below. For the detailed workflow structure, refer to CWL Viewer.
In this example we will run the analysis of Multiome ATAC and RNA sequencing data described in "WNN analysis of 10x Multiome, RNA + ATAC". First, make sure you have cwltool, Docker, git, gzip and wget tools installed, then proceed to the steps below.
❗ With the minimum required Docker configuration (4 CPU and 32GB of RAM) the approximate running time is up to 6 h.
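The prerequisite check can be scripted. The snippet below is a minimal sketch and not part of the repository: `check_tools` is a hypothetical helper name, and it only tests that each command is found on `PATH`. It does not verify tool versions or that Docker is running with the resources noted above.

```shell
# Report which of the given command-line tools are not available.
# Prints one "missing: <tool>" line per absent tool and returns 1 if any is absent.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, with the tool list taken from the text above:
#   check_tools cwltool docker git gzip wget
```

No output and a zero exit status mean all listed tools were found.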
- Create a temporary folder and clone the current repository.
    mkdir sc_multiome
    cd sc_multiome
    git clone https://github.com/Barski-lab/sc-seq-analysis.git
- Create a folder for input data. Download required input files using commands below.
    mkdir inputs
    cd inputs
    wget -O pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.tar.gz https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.tar.gz
    wget -O pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz
    wget -O pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi
    wget -O gencode.v40.annotation.gtf.gz https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_40/gencode.v40.annotation.gtf.gz
    wget -O hg38-blacklist.v2.bed.gz https://raw.githubusercontent.com/Boyle-Lab/Blacklist/master/lists/hg38-blacklist.v2.bed.gz
    gzip -d hg38-blacklist.v2.bed.gz
    gzip -d gencode.v40.annotation.gtf.gz
- Copy the job definition file into the inputs folder.
    cp ../sc-seq-analysis/jobs/sc-multiome-analyze-wf.yaml .
- Create a folder for workflow outputs and execute the sc-multiome-analyze-wf.cwl workflow with the sc-multiome-analyze-wf.yaml job definition file.
    cd ..
    mkdir outputs
    cd outputs
    cwltool ../sc-seq-analysis/workflows/sc-multiome-analyze-wf.cwl ../inputs/sc-multiome-analyze-wf.yaml
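Since the run can take hours, it may be worth confirming that the downloads are in place before launching cwltool. The following is a minimal sketch under stated assumptions: `check_inputs` is a hypothetical helper name, and the file names are the ones produced by the wget and gzip -d commands above. It does not check that the job definition file references exactly these names, and it does not validate file contents.

```shell
# Check that every expected input file exists and is non-empty.
# $1: folder holding the downloads (defaults to the current folder).
# Prints one line per problem file and returns 1 if any is missing or empty.
check_inputs() {
    dir="${1:-.}"
    rc=0
    for f in \
        pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.tar.gz \
        pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz \
        pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi \
        gencode.v40.annotation.gtf \
        hg38-blacklist.v2.bed
    do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, from the sc_multiome folder:
#   check_inputs inputs
```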
Expected outputs (some of the plots and files are omitted)
Note: because we constantly improve our tools and frequently update the Dockerfile, your outputs may differ slightly from the plots below. To reproduce exactly the same results, switch to commit 4819746.
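Switching to that commit can be done with `git checkout`. This is a minimal sketch, assuming the repository was cloned into the sc-seq-analysis folder as in the first step; `pin_commit` is a hypothetical helper name. After the checkout the repository is in a detached HEAD state.

```shell
# Check out a specific commit in a cloned repository and print the
# short hash of the resulting HEAD so the switch can be confirmed.
# $1: path to the repository, $2: commit to switch to.
pin_commit() {
    git -C "$1" checkout --quiet "$2" && git -C "$1" rev-parse --short HEAD
}

# Usage, from the sc_multiome folder:
#   pin_commit sc-seq-analysis 4819746
```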
Clustering results can also be evaluated interactively in UCSC Cell Browser using RangeHTTPServer or any other simple HTTP server.
cd html_data
python3 -m RangeHTTPServer # open http://localhost:8000/
Example of UCSC Cell Browser window.
Step 1. QC metrics and the results of low-quality cells removal.
Before low-quality cells removal | After low-quality cells removal |
---|---|
Step 2. Dimensionality reduction and evaluating confounding sources of variation for RNA assay.
Step 3. Dimensionality reduction and evaluating confounding sources of variation for ATAC assay.
Step 4. Cluster analysis of multimodal data, gene markers and differentially accessible peaks identification.
Example of the table with identified gene markers (top 10 rows)
resolution | cluster | feature | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj |
---|---|---|---|---|---|---|---|
0.5 | 0 | NAMPT | 0 | 2.96929022 | 0.98 | 0.257 | 0 |
0.5 | 0 | PLXDC2 | 0 | 2.9042261 | 0.994 | 0.194 | 0 |
0.5 | 0 | VCAN | 0 | 2.83310433 | 0.938 | 0.22 | 0 |
0.5 | 0 | LRMDA | 0 | 2.59552784 | 0.973 | 0.166 | 0 |
0.5 | 0 | AC020916.1 | 0 | 2.55713548 | 0.909 | 0.16 | 0 |
0.5 | 0 | SLC8A1 | 0 | 2.532127 | 0.973 | 0.182 | 0 |
0.5 | 0 | SAT1 | 0 | 2.43424006 | 0.996 | 0.505 | 0 |
0.5 | 0 | ACSL1 | 0 | 2.38210986 | 0.802 | 0.151 | 0 |
0.5 | 0 | ANXA1 | 0 | 2.38054138 | 0.92 | 0.465 | 0 |
Example of the table with identified differentially accessible peaks (top 10 rows)
resolution | cluster | feature | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj |
---|---|---|---|---|---|---|---|
0.5 | 0 | chr6-44057321-44060655 | 0 | 0.83326212 | 0.717 | 0.092 | 0 |
0.5 | 0 | chr6-41280331-41287503 | 0 | 0.80445886 | 0.755 | 0.13 | 0 |
0.5 | 0 | chr7-101716926-101719338 | 0 | 0.80173011 | 0.675 | 0.093 | 0 |
0.5 | 0 | chr22-38950570-38958424 | 0 | 0.79980587 | 0.803 | 0.146 | 0 |
0.5 | 0 | chr1-212484685-212489524 | 0 | 0.79791512 | 0.698 | 0.103 | 0 |
0.5 | 0 | chr19-4539429-4544501 | 0 | 0.78108779 | 0.659 | 0.1 | 0 |
0.5 | 0 | chr20-50269694-50277398 | 0 | 0.77624021 | 0.804 | 0.144 | 0 |
0.5 | 0 | chr20-1943201-1947850 | 0 | 0.77379333 | 0.707 | 0.114 | 0 |
0.5 | 0 | chr9-129776538-129778267 | 0 | 0.77122705 | 0.621 | 0.095 | 0 |
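Both tables above share the same columns, so they can be filtered the same way. The sketch below assumes the table was saved as a tab-separated file with a header row naming those columns; the file format, the `filter_markers` helper name, the file name in the usage comment and the thresholds are all illustrative assumptions, not something defined by the workflow.

```shell
# Print the header plus the rows with avg_log2FC >= $2 and p_val_adj <= $3.
# $1: tab-separated table whose header row contains the column names
#     "avg_log2FC" and "p_val_adj" (columns are located by name, not position).
filter_markers() {
    awk -F '\t' -v min_fc="$2" -v max_p="$3" '
        NR == 1 {
            # Remember the position of every column by its header name.
            for (i = 1; i <= NF; i++) col[$i] = i
            print
            next
        }
        # The "+ 0" forces a numeric comparison.
        ($(col["avg_log2FC"]) + 0 >= min_fc + 0) && ($(col["p_val_adj"]) + 0 <= max_p + 0)
    ' "$1"
}

# Usage, with a hypothetical file name:
#   filter_markers gene_markers.tsv 2.5 0.05
```

Looking up columns by header name keeps the filter working if the column order changes between tables.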