Since the loader already saves meta-data for each genotype call in the VCF format, it would be extremely useful to let the user set thresholds for what constitutes a good/bad quality call, and then store that information directly in the database, also as meta-data. For our purposes, it is more efficient to calculate and store quality metrics once than to repeat these calculations on the front-end every time a user requests genotype calls through a web interface.
For example,
The user specifies that they want a quality metric named "NDQR" to be saved as meta-data.
The user also provides thresholds, let's say lower and upper thresholds, to categorize each call as "bad", "acceptable" or "excellent" quality. These thresholds are set based on meta-data that is already present in the genotype file. In this case, the user may specify a lower threshold of a read depth (DP) of 5 and an allele depth (AD) of 4, whereas the upper threshold requires a read depth of 50 and an allele depth of 45.
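To make the example concrete, here is a minimal sketch of how such a categorization could work. It is not the loader's implementation. The function name `categorize_call`, the dictionary layout, and the rule that a call must meet every metric in a threshold to reach that category are all assumptions made for illustration. AD is also treated as a single number here, although a VCF file may report it per allele.

```python
from typing import Mapping

# Thresholds taken from the example above; the dictionary layout is an assumption.
LOWER = {"DP": 5, "AD": 4}
UPPER = {"DP": 50, "AD": 45}


def categorize_call(
    metrics: Mapping[str, float],
    lower: Mapping[str, float],
    upper: Mapping[str, float],
) -> str:
    """Return "bad", "acceptable" or "excellent" for one genotype call.

    Assumption: a call reaches a category only if every metric named in
    that threshold is met; a missing metric counts as not met.
    """

    def meets(threshold: Mapping[str, float]) -> bool:
        return all(
            key in metrics and metrics[key] >= minimum
            for key, minimum in threshold.items()
        )

    if meets(upper):
        return "excellent"
    if meets(lower):
        return "acceptable"
    return "bad"


if __name__ == "__main__":
    for call in ({"DP": 60, "AD": 50}, {"DP": 10, "AD": 5}, {"DP": 3, "AD": 2}):
        print(call, categorize_call(call, LOWER, UPPER))
```

The resulting label is what would be stored as the quality meta-data at load time, so the web interface only needs to read it back.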
We would also like to extend this to all file formats, not just VCF, by allowing an optional column in the legacy format and by adding the ability to parse key-value pairs within the genotype call column of the matrix format. This gives the user the most flexibility to specify quality in whatever way suits their data (for example, if they happen to know read depth or have a percentage-based quality score), regardless of file format.
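As a rough illustration of the matrix-format idea, the sketch below parses one genotype cell that carries key-value pairs after the call. The cell syntax (`AA;DP=12;AD=10`), the separators, and the function name `parse_matrix_cell` are hypothetical. The proposal does not define a syntax, so this is only one possible shape.

```python
from typing import Dict, Tuple


def parse_matrix_cell(
    cell: str, pair_sep: str = ";", kv_sep: str = "="
) -> Tuple[str, Dict[str, str]]:
    """Split a matrix cell into (genotype call, meta-data).

    Hypothetical syntax: the call comes first, followed by optional
    key=value pairs, e.g. "AA;DP=12;AD=10". A cell holding only a call
    yields empty meta-data, so existing matrix files still parse.
    """
    parts = [part.strip() for part in cell.split(pair_sep) if part.strip()]
    if not parts:
        return "", {}

    call = parts[0]
    metadata: Dict[str, str] = {}
    for part in parts[1:]:
        key, sep, value = part.partition(kv_sep)
        if sep:  # skip fragments that are not key=value pairs
            metadata[key.strip()] = value.strip()
    return call, metadata


if __name__ == "__main__":
    print(parse_matrix_cell("AA;DP=12;AD=10"))
    print(parse_matrix_cell("AT"))
```

Values are kept as strings so that both read depths and percentage-based scores can pass through unchanged. Converting them to numbers would be left to whichever step applies the thresholds.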
For reference, this is our (myself and @laceysanderson) thought process about this on the whiteboard: