Use API/PL for Genotype Matrix & Hapmap format #51
Metadata
Documentation:
Description
This PR updates the Genotype Matrix and HAPMAP formats to use the vcf_filter_read_VCF_into_array() API function. This has the benefit of removing repetitive code. Furthermore, the API function corrects the GT using the PL. This PR also adds simple tests for the genotype matrix and hapmap formats.
Testing?
This PR includes automated testing :-D Please check that the files short.genotype_matrix.txt and short.hapmap.txt match what you expect. NOTE: when comparing with the raw VCF, the calls may not match the GT; use the PL instead to determine what you should expect.
Manual Testing: (NOT NECESSARY due to automated testing ;-P)
A) Create a genotype matrix and hapmap version of a given file before switching to the branch and after switching to the branch. You should notice small differences in the genotype calls based on the PL correction and a fix to the first line of SNPs (added newline).
B) Check the calls from the VCF and compare to the output genotype matrix and hapmap files. Ensure that the calls are what you expect.
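For anyone doing the comparison in B) by hand, here is a rough sketch of how the expected call can be read off the PL. It is not the code inside vcf_filter_read_VCF_into_array(); the function name genotype_from_pl is hypothetical, and the sketch assumes diploid calls and the standard VCF ordering of PL values (the genotype j/k, with j <= k, sits at index k*(k+1)/2 + j). The genotype with the lowest PL is taken as the most likely one.

```python
def genotype_from_pl(pl_field, sep="/"):
    """Return the diploid genotype with the lowest PL, e.g. "0/1".

    Hypothetical helper for checking expected calls by hand; it is an
    assumption about what "correcting the GT using the PL" amounts to,
    not the module's actual implementation.
    """
    values = pl_field.split(",")
    if "." in values:
        # Missing likelihoods: there is nothing to base a call on.
        return None
    pls = [int(v) for v in values]

    # Index of the smallest PL (ties resolve to the first one).
    best = min(range(len(pls)), key=pls.__getitem__)

    # Invert index = k*(k+1)/2 + j to recover the allele pair (j, k).
    k = 0
    while (k + 1) * (k + 2) // 2 <= best:
        k += 1
    j = best - k * (k + 1) // 2
    return f"{j}{sep}{k}"


# Biallelic site: PL order is 0/0, 0/1, 1/1.
print(genotype_from_pl("255,0,40"))   # 0/1
print(genotype_from_pl("0,30,255"))   # 0/0
```

So a sample whose raw GT is 0/0 but whose PL is 255,0,40 would be expected to show up as heterozygous in the output under this reading.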