Merge pull request #958 from sjspielman/sjspielman/hello-patch
Small hello-clusters notebook cleanups
sjspielman authored Dec 20, 2024
2 parents 77b6f77 + b53c6fa commit 22315fc
Showing 2 changed files with 41 additions and 38 deletions.
24 changes: 11 additions & 13 deletions analyses/hello-clusters/01_perform-evaluate-clustering.Rmd
@@ -2,8 +2,8 @@
title: "Performing graph-based clustering with rOpenScPCA"
date: "`r Sys.Date()`"
author: "Data Lab"
output:
  html_notebook:
    toc: yes
    toc_float: yes
    df_print: paged
@@ -76,7 +76,6 @@ set.seed(2024)
## Read in and prepare data

To begin, we'll read in the `SingleCellExperiment` (SCE) object.
We'll also establish a corresponding processed Seurat object from its raw counts that we'll use for some examples.

```{r read data}
# Read the SCE file
@@ -94,7 +93,7 @@ pca_matrix <- reducedDim(sce, "PCA")

## Perform clustering

This section will show how to perform clustering with the function `rOpenScPCA::calculate_clusters()`.

This function takes a PCA matrix with rownames representing unique cell ids (e.g., barcodes) as its primary argument.
By default it will calculate clusters using the following parameters:
@@ -152,7 +151,7 @@ cluster_results_df <- rOpenScPCA::calculate_clusters(

## Calculate QC metrics on clusters

This section demonstrates how to use several functions for evaluating cluster quality and reliability.
It's important to note that a full evaluation of clustering results would compare these metrics across a set of clustering results, with the aim of identifying an optimal parameterization.

All functions presented in this section take the following required arguments:
@@ -236,7 +235,7 @@ ggplot(purity_results) +

### Cluster stability

Another approach to exploring cluster quality is to use bootstrapping to assess how stable the clusters themselves are.
Given a set of original clusters, we can compare the bootstrapped cluster identities to original ones using the Adjusted Rand Index (ARI), which measures the similarity of two data clusterings.
ARI ranges from -1 to 1, where:

@@ -276,7 +275,7 @@ ggplot(stability_results) +

#### Using non-default clustering parameters

When calculating bootstrap clusters, `rOpenScPCA::calculate_stability()` uses `rOpenScPCA::calculate_clusters()` with default parameters.
If your original clusters were not calculated with these defaults, you should pass those customized values into this function as well to ensure a fair comparison between your original clusters and the bootstrap clusters.


@@ -331,7 +330,6 @@ If you are analyzing your data with a Seurat pipeline that includes calculating

To demonstrate this, we'll convert our SCE object to a Seurat object using the function `rOpenScPCA::sce_to_seurat()`.
Then, we'll use a simple Seurat pipeline to obtain clusters.
<!-- TODO: We will want to reference this module for further documentation on this function: https://github.com/AlexsLemonade/OpenScPCA-analysis/issues/945 -->

```{r sce to seurat, message = FALSE}
# Convert the SCE to a Seurat object using rOpenScPCA
@@ -380,8 +378,8 @@ We do not recommend using `rOpenScPCA::calculate_stability()` on Seurat clusters

### Evaluating ScPCA clusters

ScPCA cell metadata already contains a column called `cluster` with results from an automated clustering.
These clusters were calculated using `bluster`, the same tool that `rOpenScPCA` uses.
The specifications used for this clustering are stored in the SCE object's metadata, as follows; note that all other clustering parameters were left at their default values.

* `metadata(sce)$cluster_algorithm`: The clustering algorithm used
@@ -446,7 +444,7 @@ scpca_stability_df <- rOpenScPCA::calculate_stability(
```


## Saving clustering results

Results can either be directly exported as a TSV file (e.g., with `readr::write_tsv()`), or you can add the results into your SCE or Seurat object.
The subsequent examples will demonstrate saving the cluster assignments stored in `cluster_results_df$cluster` to an SCE and a Seurat object.
@@ -456,7 +454,7 @@ Objects from the ScPCA Portal already contain a column called `cluster` with res
These automatic clusters were not evaluated, and their parameters were not optimized for any given library.
To avoid ambiguity between the existing and new clustering results, we'll name the new column `ropenscpca_cluster`.
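
As an illustration of the first option mentioned above (direct export to a TSV file), here is a minimal sketch. It is not part of the notebook: the small data frame is a hypothetical stand-in for the output of `rOpenScPCA::calculate_clusters()`, the `cell_id` column name is an assumption (only the `cluster` column is named in the text), and the output path is arbitrary.

```r
# Hypothetical stand-in for the clustering results data frame.
# Only the `cluster` column name comes from the notebook text;
# `cell_id` and the values are assumptions for illustration.
cluster_results_df <- data.frame(
  cell_id = c("AAAC", "AAAG", "AAAT"),
  cluster = c(1, 2, 1)
)

# Export the results as a TSV file (the path is an arbitrary choice)
output_file <- file.path(tempdir(), "cluster_results.tsv")
readr::write_tsv(cluster_results_df, output_file)

# Read the file back in to confirm the export round-trips
exported_df <- readr::read_tsv(output_file, show_col_types = FALSE)
```

Exporting to a flat file keeps the cluster assignments independent of any particular object format, which may be convenient when results need to be shared across SCE and Seurat workflows.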

### Saving results to an SCE object

We can add columns to an SCE object's `colData` table by directly creating a column in the object with `$`.
Before we do so, we'll confirm that the clusters are in the same order as the SCE object by comparing cell ids:
@@ -473,7 +471,7 @@ all.equal(
sce$ropenscpca_cluster <- cluster_results_df$cluster
```

### Saving results to a Seurat object


We can add columns to a Seurat object's cell metadata table by directly creating a column in the object with `$` (note that you can also use the Seurat function `AddMetaData()`).
55 changes: 30 additions & 25 deletions analyses/hello-clusters/01_perform-evaluate-clustering.nb.html

Large diffs are not rendered by default.
