docs: convert to reST, re-organize and update contents across all pages #894

Merged on May 2, 2022 (25 commits)

@victorlin victorlin commented Mar 22, 2022

Preview

  • Convert pages to reST
  • Rewrite tutorial
  • Update contents of existing pages to reflect current workflow
  • Re-organize content

Related issues

Post-merge tasks

@victorlin victorlin self-assigned this Mar 22, 2022

.. code:: text

   nextstrain build . --cores 4 --configfile ncov-tutorial/genomic-surveillance.yaml

Contributor

@huddlej huddlej Mar 23, 2022

This command had a subtle issue where the combined metadata (results/combined_metadata.tsv.xz) and sequences (results/combined_sequences_for_subsampling.fasta.xz) already existed from the "custom data" run, so the workflow didn't re-run the steps to combine these files and include the newly defined inputs. As a result the workflow crashed when trying to run an augur filter query that referred to a nonexistent background_data column.

To fix the issue, I had to tell the workflow to rebuild everything with the --forceall flag:

nextstrain build . --forceall --configfile ncov-tutorial/genomic-surveillance.yaml

This seems like a Snakemake bug or a bug in our workflow; the new "background_data" entry and updated contents of the "custom_data" files should trigger a rebuild of files that depend on them.

I have to sign off for the day, but I will follow up with this issue by looking into potential differences between Snakemake versions. The Docker image still uses a relatively old Snakemake, so it's possible this is a Snakemake bug that has been fixed by a later release.
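
One way to catch this failure mode before the augur filter step is to check that the combined metadata carries a column for every configured input. The sketch below is hypothetical and not part of the workflow. It assumes, based on the failing query described above, that the combined file is an xz-compressed TSV with one column per input name; the function name is made up for illustration.

```python
import csv
import lzma


def missing_origin_columns(metadata_path, origins):
    """Return the input names that have no column in the combined metadata.

    Hypothetical helper: assumes an xz-compressed, tab-separated file whose
    first row is the header.
    """
    with lzma.open(metadata_path, "rt", encoding="utf-8", newline="") as handle:
        header = next(csv.reader(handle, delimiter="\t"))
    columns = set(header)
    # Preserve the caller's order so the report matches the config.
    return [origin for origin in origins if origin not in columns]
```

A non-empty result for `results/combined_metadata.tsv.xz` would suggest the file predates the current config, in which case a forced rebuild like the `--forceall` command above is one way to recover.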

Contributor

Still looking into the DAG issue with Snakemake, but noting that this build ran in 42 min on 3 CPUs with the config in ncov-tutorial@cb2f69fe413a6b979687b52b897f6fdd2a8c4da9.

Member Author

Ran updated config in 27 min on an M1 Mac using 8 cores (native runtime).

Member Author

@huddlej I am running this with the Docker runtime on my computer (very slowly, due to nextstrain/docker-base#35). I am at the refine step and can see that this command already ran successfully:

python3 scripts/combine_metadata.py --metadata results/sanitized_metadata_reference_data.tsv.xz results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_background_data.tsv.xz --origins reference_data custom_data background_data --output results/combined_metadata.tsv.xz 2>&1 | tee logs/combine_input_metadata.txt

So, I don't think I am able to reproduce this 😕

Member

@jameshadfield jameshadfield left a comment

Nice work @victorlin! I've read through and tested the first tutorial. Will try to cover the rest tomorrow.

Member

@jameshadfield jameshadfield left a comment

Awesome @victorlin! Thanks for taking the lead on this. I think the three tutorials are great. In addition to the in-line comments, I have bigger picture comments here about removing / streamlining the other pages. Perhaps this is a separate PR?

remove my_profiles
The three tutorials plus the "preparing your data" page cover pretty much everything that ./my_profiles did. How do we feel about:

  • removing ./reference/multiple_inputs and the corresponding ./data/ files?
  • shifting any useful config YAMLs into the ncov-tutorial page as examples? This may need a corresponding page in the "reference material" along the lines of "more complicated configuration file examples".

reorder the pages in "reference material"

  • Orientation / overview pages should come first

Content to be revised later.
pandoc -f markdown -t rst --wrap=none docs/src/reference/customizing-analysis.md -o docs/src/reference/customizing-analysis.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/orientation-files.md -o docs/src/reference/orientation-files.rst
pandoc -f markdown -t rst --wrap=none docs/src/guides/data-prep.md -o docs/src/guides/data-prep.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/customizing-visualization.md -o docs/src/reference/customizing-visualization.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/orientation-workflow.md -o docs/src/reference/orientation-workflow.rst
pandoc -f markdown -t rst --wrap=none docs/src/visualization/sharing.md -o docs/src/visualization/sharing.rst
pandoc -f markdown -t rst --wrap=none docs/src/tutorial/running.md -o docs/src/tutorial/running.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/metadata-fields.md -o docs/src/reference/metadata-fields.rst
pandoc -f markdown -t rst --wrap=none docs/src/reference/data_submitter_faq.md -o docs/src/reference/data_submitter_faq.rst
rm docs/src/reference/data_submitter_faq.md
pandoc -f markdown -t rst --wrap=none docs/src/reference/naming_clades.md -o docs/src/reference/naming_clades.rst
rm docs/src/reference/naming_clades.md
pandoc -f markdown -t rst --wrap=none docs/src/reference/remote_inputs.md -o docs/src/reference/remote_inputs.rst
rm docs/src/reference/remote_inputs.md
pandoc -f markdown -t rst --wrap=none docs/src/visualization/interpretation.md -o docs/src/visualization/interpretation.rst
rm docs/src/visualization/interpretation.md
pandoc -f markdown -t rst --wrap=none docs/src/visualization/narratives.md -o docs/src/visualization/narratives.rst
rm docs/src/visualization/narratives.md
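
The pandoc invocations above all follow one pattern, so they could be generated rather than typed out. This is a minimal sketch under that assumption, not the script actually used for the conversion; `pandoc_command` is a hypothetical helper name, and only the flags shown in the commands above are used.

```python
from pathlib import PurePosixPath


def pandoc_command(markdown_path):
    """Build the Markdown-to-reST pandoc invocation for one file."""
    source = PurePosixPath(markdown_path)
    # Same path with the extension swapped, e.g. running.md -> running.rst.
    target = source.with_suffix(".rst")
    return [
        "pandoc", "-f", "markdown", "-t", "rst", "--wrap=none",
        str(source), "-o", str(target),
    ]
```

Each returned list can be passed to `subprocess.run(..., check=True)`; removing the original `.md` file afterwards, as some of the commits above do, is left as a separate step.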
- Move existing files to https://github.com/nextstrain/ncov-tutorial/tree/main/examples
    - nextstrain/ncov-tutorial@9fa64b8
- Add placeholder README pointing readers to new guide (page will exist once tutorial PR changes merged)
To be referenced by tutorial.
@victorlin victorlin force-pushed the tutorial branch 2 times, most recently from b9c3adf to 2557b78 Compare April 29, 2022 05:06
- Add the new tutorial pages along with supporting images.
- Update the reference to demo videos
@victorlin victorlin force-pushed the tutorial branch 2 times, most recently from 0b9ebf4 to f6d3b46 Compare April 29, 2022 18:18
- Organizational changes:
    - Expose pages in main sidebar (6e35cdc)
    - Move pages to guides:
        - "Update the workflow" section from tutorial/setup -> guides/update-workflow
        - reference/customizing-analysis -> guides/workflow-config-file
        - reference/customizing-visualization -> guides/customizing-visualization
        - reference/data-prep -> guides/data-prep
            - Split "Data Prep" into 3 pages
    - Add reference/glossary
    - Rename reference files:
        - configuration -> workflow-config-file
        - orientation-files -> files
        - orientation-workflow -> nextstrain-overview
        - tutorial/running -> troubleshoot
    - Remove files:
        - reference/multiple_inputs
- Changes across multiple files:
    - Fix MD->reST conversion glitches
    - Reference "builds.yaml" as "workflow config file"
    - Remove my_profiles/ references
    - Reference glossary terms where appropriate
    - Use sphinx reference directive [1] to link to specific sections
- Per-file changes:
    - tutorial/setup
        - Remove basic example in setup page (replaced by the "example data" tutorial)
    - reference/gisaid-search
        - Remove off-topic line
    - reference/nextstrain-overview
        - Capitalize Augur, Auspice, Snakemake, Nextflow
        - Describe build vs. workflow
    - reference/files
        - Re-organize page with "user files" vs. "internal files"
    - reference/troubleshoot
        - Formerly tutorial/running, it has been stripped down to just troubleshooting content
    - dev_docs
        - Link to docs for installation/setup

[1]: https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#cross-referencing-arbitrary-locations
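
With this many pages moved or renamed, leftover links to old page names are easy to miss. The following is a rough sketch of a check for them, not something this PR includes. The mapping is copied from the rename list above, with the `reference/` prefix on the renamed reference files assumed from the list heading, and `stale_references` is a hypothetical name.

```python
# Old -> new page paths, taken from the commit message above.
# Assumption: paths are relative to docs/src and written without extensions.
RENAMED_PAGES = {
    "reference/customizing-analysis": "guides/workflow-config-file",
    "reference/customizing-visualization": "guides/customizing-visualization",
    "reference/orientation-files": "reference/files",
    "reference/orientation-workflow": "reference/nextstrain-overview",
    "tutorial/running": "reference/troubleshoot",
}


def stale_references(text, renamed=RENAMED_PAGES):
    """Return (old, new) pairs for every old page path still mentioned in text."""
    # Plain substring matching: crude, but enough to flag candidates for review.
    return [(old, new) for old, new in renamed.items() if old in text]
```

Running this over each page's text gives a list of candidate links to update by hand; it does not rewrite anything itself.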
@victorlin
Member Author

Merging with @jameshadfield's approval on Slack.
