DPL-471-2: malformed root_sample_ids with duplicates in MLWH, unpicked #537
Labels
Data integrity
data fix
Enhancement
New feature or request
GSU
Delivers work for the GSU unit
Heron
RVI
RVI Project
User Story
Part of the wider DPL-471 issue which spawned in turn from DPL-048. Related are DPL-0483-2, DPL-0483-4 and DPL-0483-5.
This story concerns the 18,138 malformed root_sample_ids in the MLWH lighthouse_sample table which have duplicate samples with the correct root_sample_id. These came from the MK lighthouse lab in August 2021 and have an extra substring (something like '_RNA123456789') concatenated on the end of the correct ID.
Fix
The main issue is that, because each of these rows duplicates a sample that already has the correct root_sample_id, we cannot simply fix the ID in place. The root_sample_id/plate_barcode/coordinate combination must be unique, and a corrected ID would collide with the existing correct row. The suggestion (this needs to be run past everyone involved!) is to delete the relevant rows, since they were never picked and only exist in the MongoDB and MLWH databases.
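To make the collision concrete, here is a minimal Python sketch of how the rows proposed for deletion could be identified from an export of the table. It is illustrative only, not the actual fix script. The suffix pattern `_RNA` followed by digits is an assumption based on the "something like '_RNA123456789'" description, and the `Row` type, the helper names and the sample values are all hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed shape of the unwanted suffix; the issue only says
# "something like '_RNA123456789'", so adjust to the real data.
SUFFIX_RE = re.compile(r"_RNA\d+$")


class Row(NamedTuple):
    """Hypothetical stand-in for the three columns that must be unique together."""
    root_sample_id: str
    plate_barcode: str
    coordinate: str


def corrected_id(root_sample_id: str) -> str:
    """Strip the assumed trailing suffix; well-formed IDs are returned unchanged."""
    return SUFFIX_RE.sub("", root_sample_id)


def deletion_candidates(rows: Iterable[Row]) -> List[Row]:
    """Return malformed rows whose corrected key already exists in the data.

    Fixing these rows in place would violate the uniqueness of
    root_sample_id/plate_barcode/coordinate, so they are the rows
    the story proposes to delete instead.
    """
    rows = list(rows)
    existing = set(rows)
    candidates = []
    for row in rows:
        fixed = corrected_id(row.root_sample_id)
        if fixed == row.root_sample_id:
            continue  # ID is already well formed
        if row._replace(root_sample_id=fixed) in existing:
            candidates.append(row)
    return candidates


# Made-up example data, not real sample IDs.
example = [
    Row("ABC123", "PLATE1", "A01"),
    Row("ABC123_RNA123456789", "PLATE1", "A01"),  # duplicate of the row above
    Row("XYZ9_RNA111", "PLATE2", "B02"),          # malformed, but no correct twin
]
```

With this example data, `deletion_candidates(example)` returns only the second row. The third row is malformed but has no correctly named duplicate, so it would fall outside this story.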
Since these samples were never picked they are not in SequenceScape, Event Warehouse or the MLWH sample table, so the only places that these need to be fixed are in the MongoDB sample table and MLWH lighthouse_sample table.
Who are the primary contacts for this story
Jonnie B
Alan K
Acceptance criteria