2. Data ingest

Most of the relevant code lives in the ingest module, with a few helper functions from utils.

plaque_assay is launched when a pair of plates for a given workflow_id and variant is exported. The only data passed directly to plaque_assay is a list of 2 strings, which are the paths to the 2 replicate plates (plate_list: List[str]):

variant = utils.get_variant_from_plate_list(plate_list, session)
workflow_id = utils.get_workflow_id_from_plate_list(plate_list)
dataset = ingest.read_data_from_list(plate_list)
indexfiles = ingest.read_indexfiles_from_list(plate_list)
dataset["variant"] = variant
indexfiles["variant"] = variant

# do stuff with dataset

Parsing metadata from paths

The workflow_id is parsed directly from the plate barcodes in the paths by utils.get_workflow_id_from_plate_list(). It raises an error if the workflow_id is not the same for both plates.
The variant is parsed from the path as an integer, which utils.get_variant_from_plate_list() then uses to query the NE_available_strains table for the name of the variant. It fails with a VariantLookupError if the database has no match for that integer.
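
The workflow_id consistency check can be sketched roughly as follows. This is not the code in utils: the barcode layout (a plate directory named `<barcode>__<timestamp>`, with the workflow_id as the last six digits of the barcode), the function names and the use of ValueError are all assumptions made for illustration.

```python
import os
from typing import List


def workflow_id_from_path(plate_path: str) -> int:
    """Extract a workflow_id from one plate path (hypothetical barcode layout)."""
    # Assumed layout: ".../<barcode>__<timestamp>", e.g. "S01000123__2021-01-01",
    # where the trailing six digits of the barcode are the workflow_id.
    plate_dir = os.path.basename(os.path.normpath(plate_path))
    barcode = plate_dir.split("__")[0]
    return int(barcode[-6:])


def workflow_id_from_plate_list(plate_list: List[str]) -> int:
    """Return the workflow_id shared by all plates, or raise if they differ."""
    ids = {workflow_id_from_path(path) for path in plate_list}
    if len(ids) != 1:
        raise ValueError(f"plates have differing workflow_ids: {sorted(ids)}")
    return ids.pop()
```

Collecting the parsed ids into a set keeps the check independent of the number of plates, although plaque_assay only ever passes two.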

Reading in tables

The PlateResults.txt files are read in as pandas DataFrames with ingest.read_data_from_list(), which:
- Reads in the dataframes and concatenates them.
- Re-labels wells to 96-well format.
- Adds metadata: variant, barcode, dilution etc.
The indexfile.txt files are read in as a pandas DataFrame with ingest.read_indexfiles_from_list().
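
A minimal sketch of the read, re-label and concatenate steps, under stated assumptions: the column names (`Row`, `Column`, `Value`, `Well`, `Plate_barcode`), the tab-separated layout and the 384-to-96 well mapping (each 2x2 block of wells collapsing onto one 96-well position) are hypothetical and may not match what ingest actually does. In-memory strings stand in for the PlateResults.txt files so the example runs on its own.

```python
import io
from typing import IO, Dict

import pandas as pd


def well_384_to_96(row: int, col: int) -> str:
    """Map a 1-based 384-well row/column to a 96-well label (assumed layout)."""
    row_96 = (row - 1) // 2      # 16 rows collapse to 8 (A-H)
    col_96 = (col - 1) // 2 + 1  # 24 columns collapse to 12
    return f"{'ABCDEFGH'[row_96]}{col_96:02d}"


def read_plates(sources: Dict[str, IO[str]]) -> pd.DataFrame:
    """Read one table per plate, add metadata columns and concatenate."""
    frames = []
    for barcode, handle in sources.items():
        df = pd.read_csv(handle, sep="\t")
        df["Well"] = [well_384_to_96(r, c) for r, c in zip(df["Row"], df["Column"])]
        df["Plate_barcode"] = barcode  # metadata taken from the plate path
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


# Two tiny stand-in plates keyed by hypothetical barcodes.
plate_a = io.StringIO("Row\tColumn\tValue\n1\t1\t10\n2\t2\t11\n")
plate_b = io.StringIO("Row\tColumn\tValue\n16\t24\t12\n")
dataset = read_plates({"S01000123": plate_a, "S02000123": plate_b})
```

Passing `ignore_index=True` to `pd.concat` gives the combined table a fresh index, so rows from the two replicate plates do not share index values.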
