Specification for genetic conditions and haplotypes #425

Open
cookeac opened this issue Oct 19, 2023 · 3 comments

cookeac commented Oct 19, 2023

Proposal or request provided by an ICAR ADE specification user:

I am trying to determine the appropriate ICAR API schemas to use for genomic test results and disease test results.

I can see how we can use the diagnoses schema for milk culture result entry. However, for disease testing, users tend to want to set a permanent flag on animals showing their status. This is used especially for those needing to export a report for traceability upon shipping the animal to another location.

The standard data fields are:

  • Date of test
  • Test type
  • Result (positive, negative, or inconclusive)
  • Result Value (numeric)

I can also see the need to include values for the source of the result and test method.


cookeac commented Aug 8, 2024

Discussed 2024-08-08
We could see a few options for this:

  1. Implement one of the existing clinical laboratory or testing laboratory data standards. These already exist (often in XML, but transferable). They are likely more complex than we require, but tools exist to specify a concrete subset.
  2. Implement a test result event that captures the animal, location, and date, a laboratory identifier, a sample ID, and then an array of results (for instance, Andrew's team had implemented one with an array of these items).
  3. Implement a test result event as above, but have it contain an array of icarDiagnosisType.
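Options 2 and 3 share the same envelope and differ only in the type of item held in the results array. A minimal TypeScript sketch of that idea follows. All names here (`Identifier`, `TestResultEvent`, `SimpleTestResult`, and every field name) are hypothetical and are not taken from the ICAR ADE schemas; the sample values are made up.

```typescript
// Hypothetical shapes for illustration only; not part of the ICAR ADE schemas.
interface Identifier {
  scheme: string;
  id: string;
}

// Shared envelope for options 2 and 3: only the result item type R differs.
interface TestResultEvent<R> {
  animal: Identifier;
  location: Identifier;
  eventDateTime: string; // ISO 8601 date-time of the test
  laboratory?: Identifier;
  sampleId?: string;
  results: R[];
}

// Option 2: a purpose-built result item (field names are assumptions).
interface SimpleTestResult {
  testName: string;
  result: string;
}

// Made-up example data showing one event carrying one result.
const exampleEvent: TestResultEvent<SimpleTestResult> = {
  animal: { scheme: "example.animal", id: "A-001" },
  location: { scheme: "example.location", id: "LOC-1" },
  eventDateTime: "2024-08-08T00:00:00Z",
  laboratory: { scheme: "example.lab", id: "LAB-7" },
  sampleId: "S-123",
  results: [{ testName: "BVD (Virus) ELISA", result: "negative" }],
};
```

Under option 3, the same envelope would be instantiated with a type mirroring icarDiagnosisType in place of `SimpleTestResult`; that type is not modelled here.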

We ask all members to check with their teams who deal with test results, or with geneticists who deal with genetic tests, whether they have sample data and which data fields they capture. This would help us decide on an approach.


cvigorsICBF commented Aug 9, 2024

Some things to consider:

  • Test type - Assuming this is the kind of test carried out, e.g. mystatinV1, mystatinV2, microsatellites, BVD (Virus) ELISA etc. This is likely to be an ever-changing list. Suggest renaming to 'Test Name'.
  • Sample type - e.g. blood, hair, ear punch etc.
  • Result - possibly too simple with just 3 values. A BVD lab test, for example, could be "positive" or "dam positive" and more. Might be better as a string; suggest renaming to 'Interpreted result' or something similar.
  • Result value - a possible scenario where the value is not numeric is a genetic test on animal coat colour.
  • Lab - Identifier for the lab carrying out the test.
  • Lab reference number.
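As a rough illustration of the points above, a result item could keep the interpreted result as a free string and allow the result value to be either numeric (optionally with a unit) or text. The type and field names below (`LabTestResult`, `ResultValue`, `describeValue`, and so on) are assumptions made for this sketch, not agreed names.

```typescript
// Hypothetical result item reflecting the suggestions above; names are assumptions.
type ResultValue =
  | { kind: "numeric"; value: number; unit?: string }
  | { kind: "text"; value: string }; // e.g. a coat colour genetic test

interface LabTestResult {
  testName: string;          // open-ended, since the list of tests keeps changing
  sampleType?: string;       // e.g. "blood", "hair", "ear punch"
  interpretedResult: string; // free string, e.g. "positive" or "dam positive"
  resultValue?: ResultValue;
  labId?: string;
  labReference?: string;
}

// Render a result value for display, whichever variant it is.
function describeValue(v: ResultValue): string {
  switch (v.kind) {
    case "numeric":
      return v.unit ? `${v.value} ${v.unit}` : String(v.value);
    case "text":
      return v.value;
  }
}
```

A discriminated union keeps numeric and text values distinct, so a consumer cannot accidentally do arithmetic on a coat colour.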

Suggestion is to get some sample scenarios of laboratory tests for disease and genetic conditions that need to be handled by this feature.

A lot of the fields in this do seem more applicable, especially the inclusion of a unit.

cookeac changed the title from "Specification for laboratory tests for disease and genetic conditions" to "Specification for genetic conditions and haplotypes" on Oct 3, 2024

cookeac commented Oct 3, 2024

Discussed in the US/Canada time zone meeting on 2024-10-02.

Potential confusion between lab tests (general) and genetic condition test results (the target of this issue).

  • Outcome: I have renamed the issue to be more specific.

Is there a standardised list of genetic conditions and haplotypes? Check with ICAR Interbull specs.

  • There doesn't seem to be a global list, though individual breed societies and national breed organisations each have a list that they focus on.
  • The Interbull specs don't include this. GenoEx has specifications for sharing SNPs and genomes, but not these results.

Examples include:

And extracts from the following list:

NAME                  CARRIER_CODE  FREE_CODE  SUFFERER_CODE
Achondrodysplasie     AC+           AC-        AC*
Arthrogrypose         AS+           AS-        AS*
Axonopathy            AX+           AX-        AX*
Holstein Haplotype 1  H1+           H1-        H1*
Holstein Haplotype 2  H2+           H2-        H2*
Holstein Haplotype 3  H3+           H3-        H3*

Most lists seem to use a name, a 2-3 character abbreviation, and then an indicator or additional character for free, carrier, or affected.
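To illustrate that pattern, the sketch below splits a code such as "H1+" into its abbreviation and a status. The suffix mapping follows only the extract shown above (`+` carrier, `-` free, `*` sufferer, treated here as "Affected"); other lists may use different characters, and the names `parseConditionCode` and `SUFFIX_TO_STATUS` are hypothetical.

```typescript
type ConditionStatus = "Free" | "Carrier" | "Affected";

// Suffix convention taken from the extract above; other lists may differ.
const SUFFIX_TO_STATUS: Record<string, ConditionStatus | undefined> = {
  "-": "Free",
  "+": "Carrier",
  "*": "Affected", // the list calls this "sufferer"
};

interface ParsedConditionCode {
  abbreviation: string;
  status: ConditionStatus;
}

// Split a code into a 2-3 character abbreviation and a status suffix.
// Returns undefined when the code does not follow the pattern.
function parseConditionCode(code: string): ParsedConditionCode | undefined {
  const match = /^([A-Z0-9]{2,3})([+\-*])$/.exec(code.trim());
  const abbreviation = match?.[1];
  const status = match ? SUFFIX_TO_STATUS[match[2] ?? ""] : undefined;
  if (!abbreviation || !status) return undefined;
  return { abbreviation, status };
}
```

A code without a recognised suffix, such as "H1" on its own, yields `undefined` rather than a guessed status.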

Should this be an event with a date?

  • A test is definitely an event that occurs at a specific date and likely involves sampling and a specific laboratory.
  • However, by the time these have passed back through an organisation for reporting, the date and laboratory may not be available.
  • The genetic conditions themselves don't change over time unless the test prediction was incorrect.

Proposed model

An API call that can return data for all animals at a location, or be filtered to a specific animal.
Each resource identifies the animal, and may include the processing organisation (or laboratory) and the (analysis) eventDate.
Each resource also has an array of results, each with:

  • identifier - a scheme and ID identifying the condition or haplotype. Ideally ICAR would build a database of unique conditions.
  • name - a descriptive name of the condition (e.g. "Holstein Haplotype 1")
  • status - the status of the condition, an enumerated value comprising "Free", "Carrier", "Affected"
  • copyCount - optional; a few genetic conditions, such as double-muscling, can involve multiple copies of the gene.
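The proposed model could be sketched in TypeScript as follows. The property names `identifier`, `name`, `status`, `copyCount`, and `eventDate` come from the proposal above; the remaining names (`Identifier`, `AnimalGeneticConditions`, `processingOrganisation`, `filterByAnimal`) are assumptions for illustration, and the filter function only imitates in memory what the API's animal filter might do.

```typescript
interface Identifier {
  scheme: string;
  id: string;
}

type ConditionStatus = "Free" | "Carrier" | "Affected";

interface GeneticConditionResult {
  identifier: Identifier; // scheme and ID of the condition or haplotype
  name: string;           // e.g. "Holstein Haplotype 1"
  status: ConditionStatus;
  copyCount?: number;     // optional, e.g. for double-muscling
}

// One resource per animal (type name is an assumption).
interface AnimalGeneticConditions {
  animal: Identifier;
  processingOrganisation?: Identifier; // or laboratory
  eventDate?: string;                  // analysis date, if known
  results: GeneticConditionResult[];
}

// Return all resources for a location, or only those for one animal.
function filterByAnimal(
  resources: AnimalGeneticConditions[],
  animal?: Identifier,
): AnimalGeneticConditions[] {
  if (!animal) return resources;
  return resources.filter(
    (r) => r.animal.scheme === animal.scheme && r.animal.id === animal.id,
  );
}
```

Keeping `processingOrganisation` and `eventDate` optional reflects the point that these details may no longer be available by the time results are reported.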
