One of the most confusing points in the definition of schemas for (genomic) variants is the conflation of descriptions of variant "prototypes" (e.g. recurring genomic changes of "equivalent" alteration - think "BRAF V600E" or "CNV region ...") with descriptions of single instances / observations. An example of this missing separation between "observation/variant calling" and interpretation is the VCF format, which contains multi-allelic "variants" and allele frequency values but is used for the annotation of individual variant calls.
In discussing formats for annotating CNVs (and possibly additional variant types) for data storage, representation and knowledge resources, it will be helpful to have a clear scoping of the different logical types:
- a variant "observation", "call" or "instance", which just represents the outcome of a technical analysis pipeline without any inference from outside data (apart from low-level calibration etc.)
  - population frequencies etc. play no role at this stage
  - "fuzzy" positions for start, end refer to technical uncertainties
- "prototypes" of variants, e.g. of exact or "equivalent" observations (from n=1 upwards)
  - "fuzzy" positions for start, end here refer to e.g. variations in the precise mapping of variants seen as "equivalent"
The proposal - apart from discussions about other parameters relevant for one or the other type - is to use a separate parameter to label the scope, e.g.:
"representation": "observation"
in contrast to
"representation": "evidence"
Such a model would a) help with some of the design discussions, and b) work nicely in the integration with different types of resources and APIs such as Beacon.
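To make the scoping concrete, here is a minimal Python sketch of how such a parameter could be used to keep the two logical types apart. Only the `representation` key and its two values come from the proposal; the `VariantRecord` class, the other field names, the `validate` helper and the coordinates are hypothetical and purely illustrative. It also assumes that `"evidence"` is the label intended for the "prototype" scope.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Scope values taken from the two examples in the proposal.
REPRESENTATIONS = {"observation", "evidence"}


@dataclass
class VariantRecord:
    # Every field except "representation" is a hypothetical name.
    representation: str
    reference_name: str
    start: Tuple[int, int]  # (min, max) interval for a "fuzzy" start
    end: Tuple[int, int]    # (min, max) interval for a "fuzzy" end
    variant_type: str
    population_frequency: Optional[float] = None


def validate(record: VariantRecord) -> List[str]:
    """Return a list of scope violations (empty if the record is consistent)."""
    problems = []
    if record.representation not in REPRESENTATIONS:
        problems.append(f"unknown representation: {record.representation!r}")
    # Observations are pure pipeline output: no population-level values.
    if (record.representation == "observation"
            and record.population_frequency is not None):
        problems.append("observations must not carry population frequencies")
    for name, (low, high) in (("start", record.start), ("end", record.end)):
        if low > high:
            problems.append(f"{name} interval is inverted")
    return problems


# Made-up coordinates, for illustration only.
call = VariantRecord("observation", "9", (21000000, 21000500),
                     (22000000, 22000500), "DEL")
prototype = VariantRecord("evidence", "9", (20900000, 21100000),
                          (21900000, 22100000), "DEL",
                          population_frequency=0.02)

print(validate(call))       # []
print(validate(prototype))  # []
```

In this sketch the same `start` / `end` intervals are used for both scopes, and it is the `representation` value that tells a consumer whether the "fuzziness" is technical uncertainty or variation between "equivalent" variants.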