2024-unicellular-tracking

Purpose

This repository accompanies the pub "A high-throughput imaging assay for phenotyping unicellular swimming". Its main purpose is to detect and track unicellular organisms in brightfield time-lapse microscopy data at scale.

Installation and setup

This repository uses conda to manage software environments and installations. If you do not already have conda installed, you can find operating system-specific instructions for installing miniconda in the miniconda documentation. After installing conda, navigate to the directory where you would like to clone the repository and run the following commands to create the pipeline run environment.

git clone https://github.com/Arcadia-Science/2024-unicellular-tracking.git
cd 2024-unicellular-tracking
conda env create -n unicellular-tracking --file envs/dev.yml
conda activate unicellular-tracking
pip install -e .

If the installation was successful, the below command will return without error.

python -c "import swimtracker"

Overview

Description of the folder structure

This repository is organized into the following top-level directories.

  • btrack_config: YAML file for configuring btrack.
  • data: CSV files containing summary motility metrics from measured cell trajectories.
  • envs: conda environment file listing the packages and dependencies used to create the conda environment.
  • notebooks: Collection of Jupyter notebooks for analyzing motility data, including the code used to generate Figures 4–7 in the pub.
  • resources: Documentation and files related to the automated microscopy acquisitions. Also includes static files such as PNGs and GIFs used for documentation within the repository.
  • results: A collection of SVG files output by the Jupyter notebooks for generating Figures 4–7 in the pub.
  • src/swimtracker: Source code, scripts, and tests comprising the key functionality of the repository including parallelized image processing, cell tracking, and statistical analysis.

Methods

Cell tracking

Cell tracking was performed by running the track_cells.py script (see the "Scripts" section below for more context) on the full dataset of raw brightfield microscopy time lapses available at https://doi.org/10.6019/S-BIAD1298. As described in data/README.md, this dataset consists of Chlamydomonas reinhardtii cells swimming in either agar microchamber pools (AMID-04_CC-124_pools) or microtiter plates (AMID-05_CC-124_wells). The following command was run to track cells in microchamber pools:

python src/swimtracker/scripts/track_cells.py \
    AMID-04_CC-124_pools/S1-Cr3-T/ \
    --vessel "pools" \
    --pool-radius 50 \
    --use-dask

The same command was repeated for the other three subdirectories (S2-Cr3-M, S3-Cr4-T, and S4-Cr4-M) by substituting the subdirectory name in the first argument. For tracking cells in microtiter plates, the same script was run with the following arguments:

python src/swimtracker/scripts/track_cells.py \
    AMID-05_CC-124_wells/ \
    --vessel "384-well plate" \
    --use-dask
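
The four pool runs described above differ only in the subdirectory passed as the first argument, so they can be scripted. The following is a minimal sketch and not part of the repository: the helper name build_pool_commands is hypothetical, and it assumes the dataset was downloaded to AMID-04_CC-124_pools/ relative to the repository root. It only builds and prints the commands; each argument list can be passed to subprocess.run to execute it.

```python
from pathlib import Path

# Subdirectories of the pools dataset, as listed in the text above
POOL_SUBDIRECTORIES = ["S1-Cr3-T", "S2-Cr3-M", "S3-Cr4-T", "S4-Cr4-M"]


def build_pool_commands(dataset_root="AMID-04_CC-124_pools"):
    """Return one track_cells.py command (as an argument list) per subdirectory.

    `dataset_root` is an assumed download location, not a fixed path.
    """
    script = "src/swimtracker/scripts/track_cells.py"
    commands = []
    for name in POOL_SUBDIRECTORIES:
        input_directory = (Path(dataset_root) / name).as_posix() + "/"
        commands.append(
            ["python", script, input_directory,
             "--vessel", "pools", "--pool-radius", "50", "--use-dask"]
        )
    return commands


if __name__ == "__main__":
    for command in build_pool_commands():
        # Print instead of running; use subprocess.run(command, check=True) to execute
        print(" ".join(command))
```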

Generating figures

The statistical analysis was performed in a series of Jupyter notebooks, which were also used to create the figures of the pub. These notebooks are located in the notebooks directory and include the code used to generate Figures 4–7.

Compute Specifications

Cell tracking was done on a Supermicro X12SPA-TF 64L running Ubuntu 22.04.1 with 512 GB RAM, 64 cores, and a 2 TB SSD.

The notebooks for statistical analysis were run on an Apple MacBook Pro with an Apple M3 Max chip running macOS Sonoma version 14.5 with 36 GB RAM, 14 cores, and a 1 TB SSD.

Data

The full dataset underlying the pub is 355 GB and thus has been uploaded to the BioImage Archive (DOI: 10.6019/S-BIAD1298). To enable users to perform the analysis related to motility metrics, this repository provides CSV files containing summary motility statistics. More information is provided in data/README.md.
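
As a starting point for exploring the provided CSV files, the sketch below lists each file with its column names and row count using only the standard library. It is an illustration rather than part of the repository: the helper name summarize_csv_files is hypothetical, and it assumes the CSV files sit directly inside the data directory (data/README.md describes the actual layout and columns).

```python
import csv
from pathlib import Path


def summarize_csv_files(data_directory="data"):
    """Map each CSV file name to a (column names, number of data rows) tuple.

    Assumes the CSV files are directly inside `data_directory` and that
    the first row of each file is a header.
    """
    summary = {}
    for csv_path in sorted(Path(data_directory).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])       # empty list for an empty file
            n_rows = sum(1 for _ in reader)  # remaining rows are data
        summary[csv_path.name] = (header, n_rows)
    return summary
```

Calling summarize_csv_files() from the repository root returns a dictionary that can be printed to see which metrics each file contains before loading it in a notebook.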

Scripts

There are four scripts located in src/swimtracker/scripts, the first three of which are for processing biological image data, while the fourth was only run once to prepare the dataset for uploading to the BioImage Archive.

  • track_cells.py: Track cells in raw brightfield time-lapse microscopy data.
  • make_movies_of_pools.py: Render an animation of tracked cells in agar microchamber pools (after cell tracking).
  • make_movies_of_wells.py: Render an animation of tracked cells in a microtiter plate (after cell tracking).
  • generate_bioimage_archive_file_lists.py: Generate the lists of files needed for the BioImage Archive upload. (No longer intended to be used.)

All scripts are configured with click such that

python src/swimtracker/scripts/{script}.py --help

will display a help message that describes what the script does, the arguments it accepts, and their default values. The three scripts for processing biological image data also accept a --glob argument that can be used to filter the set of files to process. For example, to track cells from only one row of wells from a plate, run:

python src/swimtracker/scripts/track_cells.py \
    /path/to/directory/of/nd2/files/ \
    --glob "WellB*.nd2"

For more information on glob patterns, check out the official Python documentation for the pathlib library. The default glob pattern is "*.nd2".
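
To illustrate how such a pattern selects files, the sketch below applies the same patterns with pathlib.Path.glob to a temporary directory. The file names are made up for the example, and whether the scripts call Path.glob internally is an assumption based on the reference to pathlib above.

```python
import tempfile
from pathlib import Path


def match_files(directory, pattern="*.nd2"):
    """Return the sorted names of files in `directory` matching a glob pattern."""
    return sorted(path.name for path in Path(directory).glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, created empty just to demonstrate matching
    for name in ["WellA01.nd2", "WellB01.nd2", "WellB02.nd2", "notes.txt"]:
        (Path(tmp) / name).touch()

    all_nd2 = match_files(tmp)              # default pattern: every ND2 file
    row_b = match_files(tmp, "WellB*.nd2")  # only files whose names start with "WellB"

print(all_nd2)
print(row_b)
```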

Cell tracking

track_cells.py executes cell tracking on a batch of time-lapse microscopy data. For accurate cell tracking, the script will first segment cells within each time lapse. The segmentation algorithm is effectively just background subtraction and intensity thresholding; see the section on "Tracking cells and motility phenotyping" of the pub for details. Cell tracking is done using btrack. The output for each ND2 file is a TIFF file of the segmented timelapse and a CSV file of the motility data that contains every cell detected in the segmentation.
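
The idea of background subtraction followed by intensity thresholding can be sketched in a few lines. This is a dependency-free toy illustration, not the implementation in swimtracker: the function name segment_stack is hypothetical, and estimating the background as the per-pixel temporal median and using a fixed threshold are assumptions made for the example. The pub describes the actual algorithm.

```python
from statistics import median


def segment_stack(stack, threshold):
    """Segment a (T, Y, X) stack given as nested lists of pixel intensities.

    Assumption: the background is the per-pixel temporal median, which works
    when cells move and therefore occupy any given pixel only briefly.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    background = [
        [median(frame[y][x] for frame in stack) for x in range(n_cols)]
        for y in range(n_rows)
    ]
    # A pixel is foreground when it deviates from the background by more than the threshold
    return [
        [
            [abs(frame[y][x] - background[y][x]) > threshold for x in range(n_cols)]
            for y in range(n_rows)
        ]
        for frame in stack
    ]


# Toy data: a dark "cell" (intensity 40) moving across a bright field (intensity 100)
stack = [[[40 if x == t else 100 for x in range(5)]] for t in range(5)]
masks = segment_stack(stack, threshold=30)
```

In each frame of the toy data, only the pixel currently occupied by the moving cell ends up in the mask, because the static background cancels out in the subtraction.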

Microscopy data for the pub consists of cells swimming inside one of two types of "vessels": agar microchamber pools or microtiter plates. By default, the script expects cells to be swimming in a microtiter plate, but the --vessel argument can be used to change the expected vessel type as shown in the examples below. Regardless of the vessel type, the expected input is essentially the same: a ~20 sec timelapse of brightfield microscopy data stored as an ND2 file in which unicellular organisms can clearly be seen swimming around. There are no constraints on the duration, dimensions, frame rate, or pixel size of the timelapse, but the code has thus far predominantly been tested on 20 sec timelapses with dimensions around (400, 1200, 1200) T, Y, X acquired at 20–30 frames per second. Most cell tracking has been performed on different species and strains of Chlamydomonas, hence the default of 6 µm for the min_cell_diameter_um parameter. This parameter should be increased or decreased based on the size of the organism recorded.

To track cells in time-lapse videos of 384- or 1536-well plates, parallelized by dask:

python src/swimtracker/scripts/track_cells.py \
    /path/to/directory/of/nd2/files/ \
    --output-directory /path/to/writeable/storage/location/ \
    --use-dask

To track cells in time-lapse data of 100 µm diameter agar microchamber pools, using 6 cores in parallel:

python src/swimtracker/scripts/track_cells.py \
    /path/to/directory/of/nd2/files/ \
    --output-directory /path/to/writeable/storage/location/ \
    --pool-radius 50 \
    --num-workers 6

Note that in the above examples, --output-directory is an optional argument. If not provided, output will be written to a directory named processed within the input directory (first argument). If {input-directory}/processed/ already exists, files may be overwritten.
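
The default output location described above can be expressed as a short helper, which is also handy for checking in advance whether a run might overwrite existing files. This is a sketch of the documented behavior, not code from the repository; the function names resolve_output_directory and will_overwrite are hypothetical.

```python
from pathlib import Path


def resolve_output_directory(input_directory, output_directory=None):
    """Return where outputs would be written, following the rule described above."""
    if output_directory is not None:
        return Path(output_directory)
    # Default: a directory named "processed" inside the input directory
    return Path(input_directory) / "processed"


def will_overwrite(input_directory, output_directory=None):
    """Report whether the target directory already exists and contains files."""
    target = resolve_output_directory(input_directory, output_directory)
    return target.is_dir() and any(target.iterdir())
```

For example, will_overwrite("/path/to/directory/of/nd2/files/") returns True only if a non-empty processed directory is already present there, in which case passing a fresh --output-directory avoids clobbering earlier results.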

Making movies of tracked cells

To visually confirm that segmentation and cell tracking succeeded, there are also scripts that animate cell trajectories on top of the time lapses using the napari plugin napari-animation. Which script to run depends on the type of vessel used in the experiment.

To create animations of tracked cells in 384- or 1536-well plates at 20 fps:

python src/swimtracker/scripts/make_movies_of_wells.py \
    /path/to/directory/of/nd2/files/ \
    --framerate 20 \
    --output-directory /path/to/writeable/storage/location/

To create animations of tracked cells in agar microchamber pools at 30 fps:

python src/swimtracker/scripts/make_movies_of_pools.py \
    /path/to/directory/of/nd2/files/ \
    --framerate 30 \
    --output-directory /path/to/writeable/storage/location/

The output for each ND2 file is an MP4 file that is a compressed, contrast-enhanced version of the timelapse with cell trajectories animated in a variety of colors corresponding to the trajectory ID. Note that in the above examples, --framerate and --output-directory are both optional arguments. The default frame rate is 30 fps, while the default output directory is a directory named processed within the input directory (first argument). If {input-directory}/processed/ already exists, files may be overwritten.

Contributing

See how we recognize feedback and contributions to our code.