-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle overlapping deletions #62
Conversation
@@ -29,5 +29,5 @@ Config/testthat/edition: 3 | |||
License: GPL-2 | |||
Encoding: UTF-8 | |||
LazyData: true | |||
RoxygenNote: 7.3.1 | |||
RoxygenNote: 7.3.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason why this incremented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was done automatically by devtools::document()
, possibly I updated Roxygen since the last time, not sure though.
@@ -345,7 +345,12 @@ apply.polygenic.score <- function( | |||
sample.coordinate.to.row.dict.hash <- new.env(hash = TRUE, parent = emptyenv()); | |||
|
|||
for (i in 1:nrow(merged.vcf.with.pgs.data)) { | |||
# skip if the variant is missing from all samples | |||
if (is.na(merged.vcf.with.pgs.data[i, 'Indiv'])) { | |||
next; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to include a message about the skipped variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is deep in the multiallelic site handling algorithm, it's not useful to the user. Missing sites are reported in the final output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me! Just make sure to update the date/version and add the functional changes to the changelog.
Should I do that with this PR or wait until the release PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I usually just keep it up to date as I go along, but the other PR is up already so I'm fine with just handling it there. |
I guess what I'm confused about is I already did update NEWS/changelog with my changes under "unreleased" but it sounds like you want me to also increment the version even when I am not ready to make another release. I didn't think the version should be incremented after every PR? |
Ah, sorry for the confusion. You did it right, I just didn't see the NEWS.md changes in this PR. All good! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about how you handled the * as a non-risk allele!
A small unrelated suggestion is to consider testing strand flip.
Good point, I've started an issue for potential new feature in the future. |
#61 Outlines a bug: the notation used to indicate an overlapping deletion in a genotype call is not recognized by
convert.alleles.to.pgs.dosage
.More info here: https://gatk.broadinstitute.org/hc/en-us/articles/360035531912-Spanning-or-overlapping-deletions-allele
The asterisk
*
is used in VCF format to denote such an allele, and it is not a recognized allele format by the dosage calculator functionconvert.alleles.to.pgs.dosage
.I have decided to add
*
to the allowed list of sample alleles but not to the list of allowed risk alleles. This means that an*
will always be treated as non-risk and will contribute a dosage of 0.@RoniHaas would you mind weighing in on whether you've encountered this notation in your research and how you think they should be handled?
convert.alleles.to.pgs.dosage
to recognize*
as a non-risk allele.apply.polygenic.score
andcreate.pgs.with.continuous.phenotype
I have read the code review guidelines and the code review best practice on GitHub check-list.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)-[brief_description_of_branch].
I have set up or verified the branch protection rule following the github standards before opening this pull request.
I have added the changes included in this pull request to
NEWS
under the next release version or unreleased, and updated the date.I have updated the version number in
metadata.yaml
andDESCRIPTION
.Both
R CMD build
andR CMD check
run successfully.Closes #61
Testing Results
All tests PASS