Implementing Base Count Coverage Depth #7
Yes, Script 1:
Second script added, no errors.
NB: these scripts mostly read, parse, and match data, so there is little core logic that is easy to test. Hence there are no tests.
Minor Open Questions to @AugusteRi
# filtering condition to take only Artic v4.1 protocol:
# (timeline_file["proto"] == "v41")
@AugusteRi Do we need to integrate the protocol as another filter? Is it a crucial argument to pass down? You didn't include it in the file naming before, so I have excluded it for now.
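If the protocol does turn out to be needed, one option is an optional filter that is skipped by default. The following is only a sketch: the function name `filter_by_protocol` and the `sample` column are hypothetical, and the `proto` column and `"v41"` value are taken from the commented-out condition above.

```python
from typing import Optional

import pandas as pd


def filter_by_protocol(timeline: pd.DataFrame, protocol: Optional[str] = None) -> pd.DataFrame:
    """Keep only rows sequenced with `protocol`; keep every row when it is None."""
    if protocol is None:
        # No protocol requested: behave as the current code does (no filtering).
        return timeline
    return timeline[timeline["proto"] == protocol]


# Made-up rows for illustration only; the "sample" column is an assumption.
timeline = pd.DataFrame(
    {
        "sample": ["A1", "A2", "A3"],
        "proto": ["v41", "v4", "v41"],
    }
)
v41_only = filter_by_protocol(timeline, protocol="v41")
```

Defaulting to `None` would keep the current behaviour unchanged unless a protocol is explicitly passed down.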
selected_rows = timeline_file[
    # select the rows with date from 2022-07 to 2023-03 (according to samples.wastewateronly.ready.tsv)
    (timeline_file["date"] > "2024-01-01") &
@AugusteRi Do you typically change the start date? Is it worth adding it as another parameter to the final command? As in your original script, I assume the start date is rarely changed, so I have also excluded it from the final command.
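Should the start date become a parameter after all, it could be exposed as an optional argument whose default matches the value currently hard-coded in the filter. This is a minimal sketch with `argparse`, not the interface of this PR: the flag names, the parser itself, and the example values are assumptions for illustration.

```python
import argparse
from datetime import date

# Default taken from the date filter currently hard-coded in the script.
DEFAULT_START = date(2024, 1, 1)


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command line: location and end date required, start date optional."""
    parser = argparse.ArgumentParser(description="Base coverage depth (sketch)")
    parser.add_argument("--location", required=True)
    parser.add_argument("--enddate", type=date.fromisoformat, required=True)
    # Rarely changed, so it is optional and falls back to the hard-coded value.
    parser.add_argument("--startdate", type=date.fromisoformat, default=DEFAULT_START)
    return parser


# Example invocation with made-up values.
args = build_parser().parse_args(["--location", "Zürich", "--enddate", "2024-06-30"])
```

With this shape, existing invocations stay the same and only users who need a different window pass `--startdate`.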
This Pull Request (PR) aims to integrate the quality control scripts from @AugusteRi into this new package, `usefulGnom`.

Apologies, this PR is messy and includes some unrelated project setup.
All three scripts can now be run as `snakemake` rules that are flexible to the `location` and `enddate` of the samples:

I've verified the outputs against the original scripts on Euler.
For the full analysis for `Zürich` with the last samples from `07_03`, in the `workflows` directory one can now simply run:

or for individual output files.
To configure this script, you need to edit the paths in workflows/base_coverage. stick for your personal Euler setup and the location of the data files.

Open Questions:
@AugusteRi I hope I integrated the scripts as they are intended to be run. I stuck as closely to your setup as I understood it; this leaves the ad-hoc filters you had in your code not integrated for now.
- `protocol` as another filter?
- `startdate`?

More generally, `.tsv` formats and data files? I've assumed so, and integrated the reading and parsing into `usefulGnom` for future scripts.
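For reference, a shared reader of this kind can be quite small. The sketch below is not the code in `usefulGnom`: the function name `load_timeline`, the presence of a header row, and the `sample` column are assumptions, while the `date` and `proto` columns come from the filters discussed in the review comments.

```python
import io

import pandas as pd


def load_timeline(source) -> pd.DataFrame:
    """Read a tab-separated timeline file and parse its `date` column."""
    # Assumes the file has a header row that names a `date` column.
    return pd.read_csv(source, sep="\t", parse_dates=["date"])


# In-memory stand-in for a real .tsv file; the rows are made up.
example = io.StringIO(
    "sample\tproto\tdate\n"
    "A1\tv41\t2024-02-01\n"
    "A2\tv4\t2023-12-15\n"
)
timeline = load_timeline(example)

# With parsed dates, the date filter from the script works on real timestamps.
recent = timeline[timeline["date"] > "2024-01-01"]
```

Parsing the dates once at read time means each downstream script can filter on them without repeating the conversion.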