Code and resources related to the olfactory epithelial HBC stem cell differentiation project (oeHBCdiff)
Below are the R scripts for analyzing the single-cell RNA-seq data from differentiating HBC stem cells of the olfactory epithelium, presented in the following manuscript:
Fletcher RB*, Das D*, Gadye L, Street KN, Baudhuin A, Wagner A, Cole MB, Flores Q, Choi YG, Yosef N, Purdom E, Dudoit S, Risso D, Ngai J. Deconstructing Olfactory Stem Cell Trajectories at Single Cell Resolution. Cell Stem Cell (2017). https://doi.org/10.1016/j.stem.2017.04.003 (* co-first authors)
The data are available from GEO under accession GSE95601.
The scripts in this repository take ExpressionSet (eSet) data as input and perform a series of computations, interspersed with visualizations. First, the data are filtered to remove poor-quality cells and less informative genes. The data are then normalized, and biological contaminants and known doublets (identified by co-expression of differentiated cell markers) are removed. Finally, the data are re-filtered and re-normalized.
After filtering and normalization, we clustered the data using clusterExperiment, performed developmental ordering and inferred lineage trajectories and branching with slingshot. For each lineage, differentially expressed genes were identified and then clustered to reveal coordinated gene expression. We used Gene Set Enrichment Analysis to infer pathways regulating cell fates and transitions.
We created a number of visualizations based on clustering, experimental condition, and developmental order. We displayed coordinated and correlated differentially expressed genes including transcription factors, as well as a set of cell cycle genes and selected regulators of cell fate transitions along each lineage. The olfactory receptors (OR) and factors associated with OR regulation were plotted along the neuronal lineage. We also presented the top enriched gene sets for each cell cluster.
In the project directory, run mkdir -p output/{clust,data,gClust,romer,viz,DE,EDA}/oeHBCdiff and add the new directories to .gitignore. Place the scripts in the 'scripts' directory and the initial eSet data in the 'data' directory.
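The setup step above can be sketched as a short shell script. This is a minimal sketch, assuming it is run from the project root. It uses a loop instead of brace expansion so that it also works in shells without that feature. The .gitignore entries (one per output subdirectory) are an assumption, since the text above only names the directories to create.

```shell
#!/bin/sh
# Sketch: create the output tree and keep it out of version control.
# Assumption: run from the project root; the .gitignore entry format
# below is illustrative and not prescribed by the repository.

for sub in clust data gClust romer viz DE EDA; do
    mkdir -p "output/$sub/oeHBCdiff"
done

touch .gitignore
for sub in clust data gClust romer viz DE EDA; do
    entry="output/$sub/"
    # Append each entry only once so the script can be re-run safely.
    grep -qxF "$entry" .gitignore || printf '%s\n' "$entry" >> .gitignore
done
```

Re-running the script leaves existing directories and .gitignore entries untouched.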
oeHBCdiff_1_filt_norm.sh performs the following analyses by calling various R scripts (given in parentheses):
- Filtering based on technical attributes (oeHBCdiff_filtering.R)
- Normalization using SCONE (oeHBCdiff_norm.R)
oeHBCdiff_2_renorm_clust_devO_DE.sh performs the following analyses by calling various R scripts (given in parentheses):
- Create SummarizedExperiment object for desired normalization (oeHBCdiff_makeSE.R)
- Identification of biological contaminants (oeHBCdiff_exclude.R)
- Re-filtering after removal of contaminants (oeHBCdiff_filtering.R)
- Re-normalization after removal of contaminants (oeHBCdiff_norm.R)
- Create SummarizedExperiment object (oeHBCdiff_makeSE.R)
The same script, oeHBCdiff_2_renorm_clust_devO_DE.sh, also performs the following analyses by calling various R scripts (given in parentheses):
- Clustering using clusterExperiment (oeHBCdiff_clust.R)
- Developmental ordering with slingshot (oeHBCdiff_slingshot.Rmd)
- Differential gene expression using limma, along each lineage (oeHBCdiff_de.Rmd)
- Clustering of differentially expressed genes along each lineage (oeHBCdiff_geneClustering.Rmd)
- Preparation of gene sets for Gene Set Enrichment Analysis (GSEA; oeHBCdiff_GSEAprep.Rmd)
- GSEA based on cell clustering using limma romer (oeHBCdiff_romerGSEA.R)
oeHBCdiff_3_viz.sh performs the following analyses by calling various R scripts (given in parentheses):
- Visualizations based on cell clustering (heatmap of marker genes, tSNE plots, PCA pairs plot, cluster & experimental condition bubble plots; oeHBCdiff_clusterPlots.Rmd)
- Visualizations incorporating developmental ordering (3D-PCA plots, dot plots; oeHBCdiff_devorderplots.Rmd)
- Heatmaps of cell cycle genes in the neuronal and sustentacular lineages (oeHBCdiff_cellCycle.Rmd)
- Heatmaps of differentially expressed transcription factors by lineage (oeHBCdiff_tf_hm.R)
- Transcription factor co-expression, network analysis, and visualizations (oeHBCdiff_tf.Rmd)
- Heatmaps of gene clustering in developmental order (oeHBCdiff_geneClustHeatmaps.Rmd)
- Plots of individual or pairs of genes in developmental order (oeHBCdiff_genePlots.Rmd)
- Barplots of GSEA, showing top 100 enriched gene sets per cluster (oeHBCdiff_GSEAplots.Rmd)
- Volcano plots of differentially expressed genes (oeHBC_volcano.R)
- Olfactory Receptor (OR) gene and OR regulation associated gene expression plots (oeHBCdiff_OR.R)
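The numbering of the three driver scripts suggests they are run in sequence. The following is a minimal sketch of doing so, stopping at the first failure. The function name run_pipeline, the per-step log files, and invoking the scripts with bash are all assumptions for illustration; the driver scripts themselves may expect a particular working directory or environment that is not described here.

```shell
#!/bin/sh
# Sketch: run the three driver scripts in order and stop at the first failure.
# Assumptions (not part of the repository): the function name, the log
# directory, and launching each driver script with bash.
run_pipeline() {
    script_dir=${1:-scripts}   # directory holding the driver scripts
    log_dir=${2:-logs}         # hypothetical location for per-step logs
    mkdir -p "$log_dir" || return 1

    for step in oeHBCdiff_1_filt_norm.sh \
                oeHBCdiff_2_renorm_clust_devO_DE.sh \
                oeHBCdiff_3_viz.sh; do
        if [ ! -f "$script_dir/$step" ]; then
            echo "missing driver script: $script_dir/$step" >&2
            return 1
        fi
        echo "running $step"
        # Capture stdout and stderr of each step in its own log file.
        bash "$script_dir/$step" > "$log_dir/${step%.sh}.log" 2>&1 || {
            echo "$step failed; see $log_dir/${step%.sh}.log" >&2
            return 1
        }
    done
}

# Example (from the project root): run_pipeline scripts logs
```

Keeping one log per step makes it easier to see which stage failed without re-running the earlier ones.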
PathVisio (Version 3.2.4, pathvisio.org) was used to display differential gene expression for Wnt pathway members expressed in the HBCs. To reproduce this plot, download and install PathVisio and follow the instructions below.
Download the Mm_Derby_Ensembl_85.bridge gene reference data via the downloads link at pathvisio.org. The Mm_Wnt_Signaling_Pathway_and_Pluripotency_WP723_89312.gpml pathway from the wikipathways_Mus_musculus_Curation-AnalysisCollection (available from the pathvisio.org download link) was used as a starting point. This pathway was modified to include only genes present in our data set after gene filtering. Selected genes were then removed or added to focus the pathway on canonical Wnt signaling and to include factors relevant to our experiment. The genes in the pathway were colored by differential expression (log2FC) for the HBCs (cluster 1) relative to all other clusters.
The input DE data (HBConeVsAllDE_WntSigPath.txt) and the modified pathway (WntPathway.gpml) are in the ref directory of this repository. To produce the diagram, load PathVisio; enter the File menu and open the WntPathway.gpml pathway file; then enter the Data menu and select "Select Gene Database" and load the Mm_Derby_Ensembl_85 data; enter the Data menu and select "Import expression data" and load HBConeVsAllDE_WntSigPath.txt. After the pathway is rendered, enter the Data menu, select "visualization options", select "Text Label", "Expression as color", and "logFC" and then modify the default color palette to match the final output (blue to red) presented in the paper.
- SCONE (normalization), Version 0.0.7: https://github.com/YosefLab/scone
- clusterExperiment (clustering), Version 0.99.3-9001: http://bioconductor.org/packages/release/bioc/html/clusterExperiment.html, https://github.com/epurdom/clusterExperiment
- Slingshot (lineage trajectory algorithm), Version 0.0.0.9005: https://github.com/kstreet13/slingshot