DPL-471-2: malformed root_sample_ids with duplicates in MLWH, unpicked #537

Open
3 tasks
Jonnie-Bevan opened this issue Mar 17, 2022 · 0 comments
Labels
Data integrity data fix Enhancement New feature or request GSU Delivers work for the GSU unit Heron RVI RVI Project


Jonnie-Bevan commented Mar 17, 2022

User Story
This is part of the wider DPL-471 issue, which was in turn spawned from DPL-048. Related stories are DPL-0483-2, DPL-0483-4 and DPL-0483-5.

This story concerns the 18,138 rows in the MLWH lighthouse_sample table that have a malformed root_sample_id and for which a duplicate sample with the correct root_sample_id also exists. These came from the MK lighthouse lab in August 2021 and have an extra substring (something like '_RNA123456789') appended to the end of the correct ID.
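As a rough illustration of the malformation described above, the suffix could be detected and stripped like this. The regex is an assumption based only on the '_RNA123456789' example (underscore, "RNA", then digits at the end of the ID), and the function names and sample IDs are hypothetical; the real suffixes should be checked against the data before relying on it.

```python
import re

# Assumption: the stray suffix is "_RNA" followed by digits at the very end of the ID.
SUFFIX_RE = re.compile(r"_RNA\d+$")


def is_malformed(root_sample_id: str) -> bool:
    """True if the ID ends with the assumed stray suffix."""
    return SUFFIX_RE.search(root_sample_id) is not None


def corrected_id(root_sample_id: str) -> str:
    """Return the ID with the assumed suffix removed (unchanged if absent)."""
    return SUFFIX_RE.sub("", root_sample_id)
```

For example, `corrected_id("ABC123_RNA123456789")` gives `"ABC123"`, while an ID without the suffix is returned unchanged.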

Fix
The main issue is that, because these rows are duplicates, we cannot simply correct the root_sample_id in place. The root_sample_id/plate_barcode/coordinate combination must be unique, and a corrected ID would collide with the existing correct row. The suggestion (this needs to be run past everyone involved!) is to delete the affected rows instead, since the samples were never picked and exist only in the MongoDB and MLWH databases.
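The uniqueness problem can be sketched in a few lines. This is only an illustration on in-memory tuples: `Row`, `split_malformed` and the `_RNA<digits>` suffix pattern are assumptions made for the example, not existing code. It separates malformed rows whose corrected key already exists (the ones this story covers) from those that could be corrected in place.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple

# Assumption: the stray suffix is "_RNA" followed by digits at the end of the ID.
SUFFIX_RE = re.compile(r"_RNA\d+$")


class Row(NamedTuple):
    # The three columns that must be unique in combination.
    root_sample_id: str
    plate_barcode: str
    coordinate: str


def split_malformed(rows: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Split malformed rows into (colliding, fixable_in_place)."""
    rows = list(rows)
    existing = set(rows)
    colliding, fixable = [], []
    for row in rows:
        fixed_id, n_subs = SUFFIX_RE.subn("", row.root_sample_id)
        if n_subs == 0:
            continue  # ID is already well formed
        fixed = row._replace(root_sample_id=fixed_id)
        # A corrected row that matches an existing key would break uniqueness.
        (colliding if fixed in existing else fixable).append(row)
    return colliding, fixable
```

Note that this sketch does not handle two malformed rows that would correct to the same key as each other; that case would need a separate check.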

Since these samples were never picked, they are not in Sequencescape, Event Warehouse or the MLWH sample table, so the only places where they need to be fixed are the MongoDB sample table and the MLWH lighthouse_sample table.
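If deletion is agreed, the MLWH side might look roughly like the sketch below. It is a sketch under several assumptions: Python's built-in sqlite3 stands in for the real MLWH database purely so the example runs anywhere, the table is reduced to the three key columns, the suffix pattern is guessed from the '_RNA123456789' example, and `delete_malformed_duplicates` is a hypothetical helper. It defaults to a dry run so the affected rows can be reviewed first.

```python
import re
import sqlite3

# Assumption: the stray suffix is "_RNA" followed by digits at the end of the ID.
SUFFIX_RE = re.compile(r"_RNA\d+$")


def delete_malformed_duplicates(conn: sqlite3.Connection, dry_run: bool = True):
    """Return malformed rows whose corrected key already exists.

    The rows are only deleted when dry_run is False.
    """
    rows = conn.execute(
        "SELECT root_sample_id, plate_barcode, coordinate FROM lighthouse_sample"
    ).fetchall()
    existing = set(rows)
    doomed = [
        r for r in rows
        if SUFFIX_RE.search(r[0])
        and (SUFFIX_RE.sub("", r[0]), r[1], r[2]) in existing
    ]
    if not dry_run:
        with conn:  # commit on success, roll back on error
            conn.executemany(
                "DELETE FROM lighthouse_sample "
                "WHERE root_sample_id = ? AND plate_barcode = ? AND coordinate = ?",
                doomed,
            )
    return doomed
```

The same list of keys could then be used to remove the matching documents from MongoDB and to confirm that none of them appear in the other systems.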

Who are the primary contacts for this story
Jonnie B
Alan K

Acceptance criteria

  • data are fixed (deleted?) in MLWH lighthouse_sample table
  • data are fixed (deleted?) in MongoDB sample table
  • we have double-checked that these data do not exist in Sequencescape, Event Warehouse or the MLWH sample table.
@Jonnie-Bevan Jonnie-Bevan added Enhancement New feature or request Data integrity data fix labels Mar 17, 2022
@Jonnie-Bevan Jonnie-Bevan self-assigned this Mar 22, 2022
@stevieing stevieing added the Heron label Apr 7, 2022
@sdjmchattie sdjmchattie changed the title from "DPL-048-3: malformed root_sample_ids with duplicates in MLWH, unpicked" to "DPL-471-2: malformed root_sample_ids with duplicates in MLWH, unpicked" Aug 24, 2022
@andrewsparkes andrewsparkes added the RVI RVI Project label Sep 7, 2022
@TWJW-SANGER TWJW-SANGER added the GSU Delivers work for the GSU unit label Sep 28, 2022