fix: upper case root sequence for ancestral reconstruction #1323
Thanks! This seems like a strictly beneficial change; better solutions can be added later. CI is failing because of the monkeypox restructure, so I will take mpox out of the augur CI.
After merging this, CI should work again: #1324
Makes it possible to reproduce the error and check we actually fix it and don't regress
Added a test that failed before the fix, so that alternative fixes are easy to test and we don't regress in the future. For what the bug looks like, see: https://github.com/nextstrain/augur/actions/runs/6315326703/job/17147616573#step:9:20
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1323   +/-   ##
=======================================
  Coverage   72.25%   72.25%
=======================================
  Files          79       79
  Lines        8276     8276
  Branches     1691     1691
=======================================
  Hits         5980     5980
  Misses       2011     2011
  Partials      285      285
augur/ancestral.py
I don't have enough background to understand why the change is needed. Is it because, for the purpose of ancestral reconstruction, "soft-masking" in the root sequence is irrelevant and should be ignored?
I didn't know about soft vs hard masking. Indeed, we don't ever treat upper and lower case nucleotides differently anywhere in augur - at least not as far as I'm aware.
I think you're right - at least for augur index, sequences are lowercased.
This was actually resolved by @jameshadfield in 947e217 in an identical way, so we can close this PR.
I ran into this issue when the root sequence was provided as lower case. All lower-to-upper case changes were attached to the root. This should fix it, but there might be better solutions.
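To illustrate the symptom described above, here is a minimal sketch. It is not the code in augur/ancestral.py; the `mutations` helper and the toy sequences are hypothetical, and the only assumption is that mutations are found by a case-sensitive character comparison between the root sequence and upper-case sequences.

```python
def mutations(parent: str, child: str) -> list[str]:
    """Hypothetical helper: list differences as '<from><1-based position><to>'.

    The comparison is deliberately case-sensitive, so 'a' and 'A' count
    as different states.
    """
    return [
        f"{a}{pos}{b}"
        for pos, (a, b) in enumerate(zip(parent, child), start=1)
        if a != b
    ]


root_lower = "acgt"  # root sequence provided in lower case (toy data)
tip = "ACGA"         # upper-case sequence with one real change at position 4

# Without normalisation, every position looks like a change on the root.
spurious = mutations(root_lower, tip)

# Upper-casing the root first leaves only the real substitution.
fixed = mutations(root_lower.upper(), tip)

print(spurious)
print(fixed)
```

In this toy case the lower-case root yields a "mutation" at every position, while the upper-cased root yields only the single real substitution.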