docs: convert to reST, re-organize and update contents across all pages #894
.. code:: text

   nextstrain build . --cores 4 --configfile ncov-tutorial/genomic-surveillance.yaml
This command had a subtle issue where the combined metadata (`results/combined_metadata.tsv.xz`) and sequences (`results/combined_sequences_for_subsampling.fasta.xz`) already existed from the "custom data" run, so the workflow didn't re-run the steps to combine these files and include the newly defined inputs. As a result, the workflow crashed when trying to run an `augur filter` query that referred to a nonexistent `background_data` column.
To fix the issue, I had to tell the workflow to rebuild everything with the `--forceall` flag:
nextstrain build . --forceall --configfile ncov-tutorial/genomic-surveillance.yaml
This seems like a Snakemake bug or a bug in our workflow; the new "background_data" entry and updated contents of the "custom_data" files should trigger a rebuild of files that depend on them.
I have to sign off for the day, but I will follow up with this issue by looking into potential differences between Snakemake versions. The Docker image still uses a relatively old Snakemake, so it's possible this is a Snakemake bug that has been fixed by a later release.
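One way to confirm the symptom described above before resorting to a full rebuild is to check whether the combined metadata actually contains the column that the `augur filter` query expects. The following is a minimal sketch, not part of the ncov workflow: `has_column` is a hypothetical helper, and it assumes the combined metadata is a tab-separated table with a header row in which each input origin appears as a column.

```shell
# Hypothetical helper (not part of the ncov workflow): succeed if the
# tab-separated table on stdin has a header column named exactly "$1".
has_column() {
  head -n 1 | tr '\t' '\n' | grep -qxF -- "$1"
}

# Assumed usage against the file named in the comment above:
#   xz -dc results/combined_metadata.tsv.xz | has_column background_data \
#     && echo "column present" \
#     || echo "column missing: combined file predates the new input"
```

If the column is missing, the combined file was produced before the new input was defined, which matches the crash reported here. The check does not explain why the file was not rebuilt.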
Still looking into the DAG issue with Snakemake, but noting that this build ran in 42 min on 3 CPUs with the config in ncov-tutorial@cb2f69fe413a6b979687b52b897f6fdd2a8c4da9.
Ran updated config in 27 min on an M1 Mac using 8 cores (native runtime).
@huddlej I am running this using Docker runtime on my computer (very slowly due to nextstrain/docker-base#35), but I am at the `refine` step and can see that this already ran successfully:
python3 scripts/combine_metadata.py --metadata results/sanitized_metadata_reference_data.tsv.xz results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_background_data.tsv.xz --origins reference_data custom_data background_data --output results/combined_metadata.tsv.xz 2>&1 | tee logs/combine_input_metadata.txt
So, I don't think I am able to reproduce this 😕
Nice work @victorlin! I've read through and tested the first tutorial. Will try to cover the rest tomorrow.
Awesome @victorlin! Thanks for taking the lead on this. I think the three tutorials are great. In addition to the in-line comments, I have bigger picture comments here about removing / streamlining the other pages. Perhaps this is a separate PR?
remove my_profiles
The three tutorials plus the "preparing your data" page cover pretty much all the contents that ./my_profiles did. How do we feel about:
- removing ./reference/multiple_inputs and the corresponding ./data/ files
- shifting any useful config YAMLs into the ncov-tutorial page as examples. Perhaps this needs a corresponding page in the "reference material" along the lines of "more complicated configuration file examples"
reorder the pages in "reference material"
- Orientation / overview pages should come first
Content to be revised later.
pandoc -f markdown -t rst --wrap=none docs/src/reference/customizing-analysis.md -o docs/src/reference/customizing-analysis.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/orientation-files.md -o docs/src/reference/orientation-files.rst
pandoc -f markdown -t rst --wrap=none docs/src/guides/data-prep.md -o docs/src/guides/data-prep.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/customizing-visualization.md -o docs/src/reference/customizing-visualization.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/orientation-workflow.md -o docs/src/reference/orientation-workflow.rst
pandoc -f markdown -t rst --wrap=none docs/src/visualization/sharing.md -o docs/src/visualization/sharing.rst
pandoc -f markdown -t rst --wrap=none docs/src/tutorial/running.md -o docs/src/tutorial/running.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/metadata-fields.md -o docs/src/reference/metadata-fields.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/data_submitter_faq.md -o docs/src/reference/data_submitter_faq.rst
rm docs/src/reference/data_submitter_faq.md
pandoc -f markdown -t rst --wrap=none docs/src/reference/naming_clades.md -o docs/src/reference/naming_clades.rst
rm docs/src/reference/naming_clades.md
pandoc -f markdown -t rst --wrap=none docs/src/reference/remote_inputs.md -o docs/src/reference/remote_inputs.rst
rm docs/src/reference/remote_inputs.md
pandoc -f markdown -t rst --wrap=none docs/src/visualization/interpretation.md -o docs/src/visualization/interpretation.rst
rm docs/src/visualization/interpretation.md
pandoc -f markdown -t rst --wrap=none docs/src/visualization/narratives.md -o docs/src/visualization/narratives.rst
rm docs/src/visualization/narratives.md
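The per-file conversions in these commits all follow one pattern, so they could be generated by a small loop. The sketch below is illustrative only: `print_conversion_cmds` is a hypothetical helper, not something used in this PR. It prints the `pandoc` and `rm` commands instead of running them, so the list can be reviewed first, and it assumes paths without spaces.

```shell
# Hypothetical helper: print the pandoc conversion and the follow-up rm
# for each Markdown path given as an argument (nothing is executed).
print_conversion_cmds() {
  for md in "$@"; do
    rst="${md%.md}.rst"   # same path with the .md suffix swapped for .rst
    printf 'pandoc -f markdown -t rst --wrap=none %s -o %s\n' "$md" "$rst"
    printf 'rm %s\n' "$md"
  done
}

# Example: show the commands for one page.
print_conversion_cmds docs/src/reference/naming_clades.md
```

Piping the output to `sh` would execute the commands, which requires `pandoc` to be installed.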
- Move existing files to https://github.com/nextstrain/ncov-tutorial/tree/main/examples
- nextstrain/ncov-tutorial@9fa64b8
- Add placeholder README pointing readers to new guide (page will exist once tutorial PR changes merged)
To be referenced by tutorial.
b9c3adf to 2557b78
- Add the new tutorial pages along with supporting images.
- Update the reference to demo videos
0b9ebf4 to f6d3b46
- Organizational changes:
  - Expose pages in main sidebar (6e35cdc)
  - Move pages to guides:
    - "Update the workflow" section from tutorial/setup -> guides/update-workflow
    - reference/customizing-analysis -> guides/workflow-config-file
    - reference/customizing-visualization -> guides/customizing-visualization
    - reference/data-prep -> guides/data-prep
  - Split "Data Prep" into 3 pages
  - Add reference/glossary
  - Rename reference files:
    - configuration -> workflow-config-file
    - orientation-files -> files
    - orientation-workflow -> nextstrain-overview
    - tutorial/running -> troubleshoot
  - Remove files:
    - reference/multiple_inputs
- Changes across multiple files:
  - Fix MD->reST conversion glitches
  - Reference "builds.yaml" as "workflow config file"
  - Remove my_profiles/ references
  - Reference glossary terms where appropriate
  - Use sphinx reference directive [1] to link to specific sections
- Per-file changes:
  - tutorial/setup
    - Remove basic example in setup page (replaced by the "example data" tutorial)
  - reference/gisaid-search
    - Remove off-topic line
  - reference/nextstrain-overview
    - Capitalize Augur, Auspice, Snakemake, Nextflow
    - Describe build vs. workflow
  - reference/files
    - Re-organize page with "user files" vs. "internal files"
  - reference/troubleshoot
    - Formerly tutorial/running, it has been stripped down to just troubleshooting content
  - dev_docs
    - Link to docs for installation/setup

[1]: https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#cross-referencing-arbitrary-locations
Merging with @jameshadfield's approval on Slack.
Preview
Related issues
Post-merge tasks
redirects.yml
using readthedocs-cli