This project aligns TMA cores across consecutive TMA slides, each stained for a different protein using IHC.
In this project we cover the following topics:
- Unmixing
- Registration
- Segmentation
- Protein co-expression
- Image analysis
- Software
For insight into how to use the code, see the sections below. This is not an end-to-end software package but a collection of scripts that require a basic data setup. The project bundles some libraries, such as Alpha-AMD for registration, together with our utility functions for handling DZI tiled images, performing unmixing, and other preprocessing operations. We also provide two DZI pyramids if you want to try out this code; see the data section. We have created explanatory videos: you can watch our video abstract and three videos showing different aspects of running the code and inspecting the results.
If you want to try this and have a similar question, don't hesitate to contact us at the BioImage Informatics National Facility at SciLifeLab (Sweden).
For this specific implementation of our methodology, you need the TMA slides as DZI pyramids. Cores must be annotated, and a simple text configuration file is needed to tell the code which slides to process and where to find their information.
DZI tiles can be created using VIPS and OpenSlide, which accept the formats listed on their websites:
- Aperio (.svs, .tif)
- Hamamatsu (.vms, .vmu, .ndpi)
- Leica (.scn)
- MIRAX (.mrxs)
- Philips (.tiff)
- Sakura (.svslide)
- Trestle (.tif)
- Ventana (.bif, .tif)
- Generic tiled TIFF (.tif)
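As an example, a DZI pyramid can be produced with the `dzsave` command of the `vips` CLI. The sketch below only builds the command line (file names are placeholders, and running it requires libvips to be installed):

```python
# Hypothetical sketch: tile a whole-slide image into a DZI pyramid using the
# `vips dzsave` command. "slide.svs" and "slide_dzi" are placeholder names.
import subprocess


def dzsave_command(slide_path, out_base, tile_size=254):
    """Build the vips CLI invocation that writes <out_base>.dzi plus a tile folder."""
    return ["vips", "dzsave", slide_path, out_base, "--tile-size", str(tile_size)]


cmd = dzsave_command("slide.svs", "slide_dzi")
# subprocess.run(cmd, check=True)  # uncomment once libvips is installed
```

The same operation is also exposed through language bindings such as pyvips, if you prefer not to shell out.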
You can have several TMA slides, each stained for a different protein. You must choose the one to which all the others will be aligned; this slide is called "fixed" and the rest are "moving". Additionally, core locations must be given as polygons in image coordinates.
This implementation expects a JSON file with the following structure:
```
{
  "regions": {
    "<protein name>": {
      "<case>": #,
      "<block>": #,
      "property": "moving" or "fixed",
      "regionX1": {
        "id": #,
        "points": [],        // normalized coordinates (divided by image width)
        "globalPoints": [],  // points in image coordinates
        "len": #,            // length of the point array
        // bounding box
        "_xmin": #,          // normalized value
        "_xmax": #,          // normalized value
        "_ymin": #,          // normalized value
        "_ymax": #,          // normalized value
        "_gxmin": #,         // image coordinate value
        "_gxmax": #,         // image coordinate value
        "_gymin": #,         // image coordinate value
        "_gymax": #          // image coordinate value
      }
    }
  }
}
```
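To make the structure concrete, here is a minimal sketch that walks a dictionary with this shape and collects each core's bounding box in image coordinates. The protein name, region key, and numeric values are made up for illustration:

```python
# Hypothetical sketch: traverse the core-annotation structure described above
# and list each region's bounding box in image coordinates. All values below
# are illustrative, not real data.
import json

annotations = {
    "regions": {
        "CD3": {
            "case": 1,
            "block": 2,
            "property": "fixed",
            "region1": {
                "id": 1,
                "points": [[0.10, 0.20]],
                "globalPoints": [[1000, 2000]],
                "len": 1,
                "_gxmin": 900, "_gxmax": 1100,
                "_gymin": 1900, "_gymax": 2100,
            },
        }
    }
}


def core_bounding_boxes(data):
    """Yield (protein, region_key, (xmin, xmax, ymin, ymax)) in image coords."""
    for protein, slide in data["regions"].items():
        for key, region in slide.items():
            if not isinstance(region, dict):
                continue  # skip scalar metadata fields such as case/block/property
            yield protein, key, (region["_gxmin"], region["_gxmax"],
                                 region["_gymin"], region["_gymax"])


boxes = list(core_bounding_boxes(annotations))
```

In practice you would load the dictionary with `json.load` from the annotation file named in the CSV configuration.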
You also need a CSV file telling the code where everything is. Each line contains the following values: protein, block, slide format, case, a file-name prefix (if any), the location of the associated DZI, and the name of the JSON file describing the cores.
See the examples below.
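A minimal sketch of reading such a CSV, assuming the column order described above (the field names and the sample row are invented for illustration):

```python
# Hypothetical sketch: parse the slide-configuration CSV. The column order
# follows the description above; the sample row is made up for illustration.
import csv
import io

FIELDS = ["protein", "block", "slide_format", "case", "prefix", "dzi_path", "json_name"]

# One illustrative line; the prefix field is empty here.
sample = "CD3,2,svs,1,,slides/CD3_1_2.dzi,CD3_1_2_regions.json\n"

rows = [dict(zip(FIELDS, values)) for values in csv.reader(io.StringIO(sample))]
```

With a real file you would replace `io.StringIO(sample)` by `open("config.csv")`.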
To annotate the cores we use our own interface, available in this repository under viewer. More information on how to use our viewer is available here: TissUUmaps
To start, follow these instructions; we refer to this part as track 1. There is no need to go over the whole repository: just start here and, if you want, watch the video. The file blockAlign.py runs 3 steps:
To begin, make sure you specify the required settings in the script.
This track continues after everything in track 1 has run. Sometimes, to answer the question at hand, you need to perform tumor segmentation; we use Random Forests to do so. Here you will find a handy notebook for image segmentation based on sparse annotations. In our case, an expert pathologist marked several cores from the whole experiment to indicate areas of tumor, non-tumor, and background. These sparse seeds (masks, annotations) tell us where to extract distinctive features. To use the notebook, open RandomForestTMmaster.ipynb. To understand more about this process, watch the video.
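The sparse-annotation idea can be sketched as follows. This is not the notebook's actual code: the features, image size, and seed placement are toy values chosen only to show how pixels under sparse seeds train a classifier that is then applied to every pixel.

```python
# Illustrative sketch (not RandomForestTMmaster.ipynb itself): train a Random
# Forest on pixels under sparse seed annotations, then classify all pixels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy "image": 3 feature channels per pixel (e.g. color / filter responses).
h, w = 32, 32
features = rng.random((h, w, 3))
features[:, :16, 0] += 1.0          # make the left half distinguishable

# Sparse seeds: 0 = unlabeled, 1 = "tumor", 2 = "non-tumor".
seeds = np.zeros((h, w), dtype=int)
seeds[4:8, 2:6] = 1                 # a few annotated "tumor" pixels (left)
seeds[4:8, 24:28] = 2               # a few annotated "non-tumor" pixels (right)

# Fit only on the labeled pixels.
labeled = seeds > 0
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(features[labeled], seeds[labeled])

# Classify every pixel to obtain a dense segmentation map.
segmentation = clf.predict(features.reshape(-1, 3)).reshape(h, w)
```

The real notebook works the same way in spirit: features are computed for the whole core image, the classifier is fit on the pixels under the pathologist's seeds, and the prediction produces the dense tumor/non-tumor/background map.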
The maps obtained in track 1 give a visual way of assessing co-expression in cores; we may also want to quantify it. Such quantification can be related to clinical data or provide evidence for interesting, unstudied relationships. To quantify with this implementation of the pipeline, you need all the data mentioned before and must have run all of track 1 plus the tumor segmentation. The file colocQuantification.py is the one we use for this purpose: it examines the specific locations of shared pieces of tissue, restricted to the tumor segmentation, to quantify co-expression.
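As a simplified picture of what such a quantification computes (this is not colocQuantification.py itself; the masks and the chosen metric are illustrative), one can measure the fraction of tumor pixels positive for both proteins in aligned binary masks:

```python
# Illustrative sketch (not colocQuantification.py): given aligned binary masks
# of two stains and a tumor mask, quantify co-expression as the fraction of
# tumor-area pixels positive for both proteins.
import numpy as np


def coexpression_fraction(stain_a, stain_b, tumor):
    """Fraction of tumor pixels where both stains are positive."""
    tumor = tumor.astype(bool)
    both = stain_a.astype(bool) & stain_b.astype(bool) & tumor
    return both.sum() / max(tumor.sum(), 1)


# Toy aligned masks (2 x 3 pixels).
a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
t = np.array([[1, 1, 1], [1, 1, 0]])

frac = coexpression_fraction(a, b, t)  # 2 co-positive pixels / 5 tumor pixels
```

The actual script works on the registered cores produced in track 1, so that corresponding pixels across slides refer to the same piece of tissue.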
We provide two DZI pyramids and the annotated cores in JSON format, in case you want to try out the pipeline (the links take you to our website first):