Move phylogenetic workflow from top-level to folder phylogenetic
j23414 committed Nov 17, 2023
1 parent 7b03fb0 commit 20eb826
Showing 13 changed files with 95 additions and 83 deletions.
90 changes: 7 additions & 83 deletions README.md
@@ -1,88 +1,12 @@
-# nextstrain.org/zika
+# Nextstrain repository for Zika virus
 
-This is the [Nextstrain](https://nextstrain.org) build for Zika, visible at
-[nextstrain.org/zika](https://nextstrain.org/zika).
+This repository contains two workflows for the analysis of Zika virus data:
 
-The build encompasses fetching data, preparing it for analysis, doing quality
-control, performing analyses, and saving the results in a format suitable for
-visualization (with [auspice][]). This involves running components of
-Nextstrain such as [fauna][] and [augur][].
+- [`ingest/`](./ingest) - Download data from GenBank, clean and curate it and upload it to S3
+- [`phylogenetic/`](./phylogenetic) - Make phylogenetic trees for nextstrain.org
 
-All Zika-specific steps and functionality for the Nextstrain pipeline should be
-housed in this repository.
+Each folder contains a README.md with more information.
 
-_This build requires Augur v6._
+## Documentation
 
-[![Build Status](https://github.com/nextstrain/zika/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/nextstrain/zika/actions/workflows/ci.yaml)
-
-## Usage
-
-If you're unfamiliar with Nextstrain builds, you may want to follow our
-[quickstart guide][] first and then come back here.
-
-There are two main ways to run & visualise the output from this build:
-
-The first, and easiest, way to run this pathogen build is using the [Nextstrain
-command-line tool][nextstrain-cli]:
-```
-nextstrain build .
-nextstrain view auspice/
-```
-
-See the [nextstrain-cli README][] for how to install the `nextstrain` command.
-
-The second is to install augur & auspice using conda, following [these instructions](https://nextstrain.org/docs/getting-started/local-installation#install-augur--auspice-with-conda-recommended).
-The build may then be run via:
-```
-snakemake
-auspice --datasetDir auspice/
-```
-
-Build output goes into the directories `data/`, `results/` and `auspice/`.
-
-## Configuration
-
-Configuration takes place entirely within the `Snakefile`, which can be read top to bottom: each rule
-specifies its file inputs and outputs as well as its parameters. There is little indirection, and each
-rule can be reasoned about on its own.
-
-
-## Input data
-
-This build starts by downloading sequences from
-https://data.nextstrain.org/files/zika/sequences.fasta.xz
-and metadata from
-https://data.nextstrain.org/files/zika/metadata.tsv.gz.
-The Nextstrain team provisions these data publicly by pulling sequences
-from NCBI GenBank via ViPR and performing
-[additional bespoke curation](https://github.com/nextstrain/fauna/blob/master/builds/ZIKA.md).
-
-Data from GenBank follows Open Data principles, such that we can make input data
-and intermediate files available for further analysis. Open Data is data that
-can be freely used, re-used and redistributed by anyone - subject only, at most,
-to the requirement to attribute and sharealike.
-
-We gratefully acknowledge the authors, originating and submitting laboratories
-of the genetic sequences and metadata for sharing their work in open databases.
-Please note that although data generators have generously shared data in an open
-fashion, that does not mean there should be free license to publish on this
-data. Data generators should be cited where possible and collaborations should
-be sought in some circumstances. Please try to avoid scooping someone else's
-work. Reach out if uncertain. Authors, paper references (where available) and
-links to GenBank entries are provided in the metadata file.
-
-For a faster build, work from example data by copying the sequences and
-metadata from `example_data/` to `data/`:
-```
-mkdir -p data/
-cp -v example_data/* data/
-```
-
-[Nextstrain]: https://nextstrain.org
-[fauna]: https://github.com/nextstrain/fauna
-[augur]: https://github.com/nextstrain/augur
-[auspice]: https://github.com/nextstrain/auspice
-[snakemake cli]: https://snakemake.readthedocs.io/en/stable/executable.html#all-options
-[nextstrain-cli]: https://github.com/nextstrain/cli
-[nextstrain-cli README]: https://github.com/nextstrain/cli/blob/master/README.md
-[quickstart guide]: https://nextstrain.org/docs/getting-started/quickstart
+- [Contributor documentation](./CONTRIBUTING.md)
88 changes: 88 additions & 0 deletions phylogenetic/README.md
@@ -0,0 +1,88 @@
# nextstrain.org/zika

This is the [Nextstrain](https://nextstrain.org) build for Zika, visible at
[nextstrain.org/zika](https://nextstrain.org/zika).

The build encompasses fetching data, preparing it for analysis, doing quality
control, performing analyses, and saving the results in a format suitable for
visualization (with [auspice][]). This involves running components of
Nextstrain such as [fauna][] and [augur][].

All Zika-specific steps and functionality for the Nextstrain pipeline should be
housed in this repository.

_This build requires Augur v6._

[![Build Status](https://github.com/nextstrain/zika/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/nextstrain/zika/actions/workflows/ci.yaml)

## Usage

If you're unfamiliar with Nextstrain builds, you may want to follow our
[quickstart guide][] first and then come back here.

There are two main ways to run & visualise the output from this build:

The first, and easiest, way to run this pathogen build is using the [Nextstrain
command-line tool][nextstrain-cli]:
```
nextstrain build .
nextstrain view auspice/
```

See the [nextstrain-cli README][] for how to install the `nextstrain` command.

The second is to install augur & auspice using conda, following [these instructions](https://nextstrain.org/docs/getting-started/local-installation#install-augur--auspice-with-conda-recommended).
The build may then be run via:
```
snakemake
auspice --datasetDir auspice/
```

Build output goes into the directories `data/`, `results/` and `auspice/`.

## Configuration

Configuration takes place entirely within the `Snakefile`, which can be read top to bottom: each rule
specifies its file inputs and outputs as well as its parameters. There is little indirection, and each
rule can be reasoned about on its own.


## Input data

This build starts by downloading sequences from
https://data.nextstrain.org/files/zika/sequences.fasta.xz
and metadata from
https://data.nextstrain.org/files/zika/metadata.tsv.gz.
The Nextstrain team provisions these data publicly by pulling sequences
from NCBI GenBank via ViPR and performing
[additional bespoke curation](https://github.com/nextstrain/fauna/blob/master/builds/ZIKA.md).

Data from GenBank follows Open Data principles, such that we can make input data
and intermediate files available for further analysis. Open Data is data that
can be freely used, re-used and redistributed by anyone - subject only, at most,
to the requirement to attribute and sharealike.

We gratefully acknowledge the authors, originating and submitting laboratories
of the genetic sequences and metadata for sharing their work in open databases.
Please note that although data generators have generously shared data in an open
fashion, that does not mean there should be free license to publish on this
data. Data generators should be cited where possible and collaborations should
be sought in some circumstances. Please try to avoid scooping someone else's
work. Reach out if uncertain. Authors, paper references (where available) and
links to GenBank entries are provided in the metadata file.

For a faster build, work from example data by copying the sequences and
metadata from `example_data/` to `data/`:
```
mkdir -p data/
cp -v example_data/* data/
```

[Nextstrain]: https://nextstrain.org
[fauna]: https://github.com/nextstrain/fauna
[augur]: https://github.com/nextstrain/augur
[auspice]: https://github.com/nextstrain/auspice
[snakemake cli]: https://snakemake.readthedocs.io/en/stable/executable.html#all-options
[nextstrain-cli]: https://github.com/nextstrain/cli
[nextstrain-cli README]: https://github.com/nextstrain/cli/blob/master/README.md
[quickstart guide]: https://nextstrain.org/docs/getting-started/quickstart
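
The "Input data" section of `phylogenetic/README.md` names two download URLs. As a rough sketch (not part of this commit), the same files could be fetched by hand with `curl`. The `base_url` and `out_dir` variables are illustrative names, and whether the workflow expects the compressed files directly under `data/` is an assumption. The loop only prints the commands; drop the `echo` to actually download.

```shell
#!/bin/sh
set -eu

# Base URL taken from the README; out_dir is an assumed destination.
base_url="https://data.nextstrain.org/files/zika"
out_dir="data"

# Print one curl command per input file (dry run).
# Remove "echo" to perform the downloads.
for name in sequences.fasta.xz metadata.tsv.gz; do
  echo curl -fsSL --create-dirs -o "$out_dir/$name" "$base_url/$name"
done
```

Here `-f` makes `curl` fail on HTTP errors, `-L` follows redirects, and `--create-dirs` creates `data/` if it does not exist yet.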
11 files renamed without changes.
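
The renamed files are not listed by name on this page. As a minimal sketch of the kind of move the commit title describes, the snippet below relocates top-level workflow files into a `phylogenetic/` folder inside a throwaway directory. `Snakefile` and `example_data/` are mentioned in the README, but treating them as part of the renamed set is an assumption, and plain `mv` is used here only to keep the sketch free of dependencies.

```shell
#!/bin/sh
set -eu

# Throwaway directory standing in for a repository checkout.
repo=$(mktemp -d)

# Hypothetical top-level workflow files (assumed names, see lead-in).
touch "$repo/Snakefile"
mkdir -p "$repo/example_data"
touch "$repo/example_data/sequences.fasta"

# Move each item into the new phylogenetic/ folder.
mkdir -p "$repo/phylogenetic"
for item in Snakefile example_data; do
  mv "$repo/$item" "$repo/phylogenetic/$item"
done

# Show the new layout.
ls "$repo/phylogenetic"
```

In a real Git checkout, `git mv <source> <destination>` performs the same move and stages it in one step. Git reports a moved file whose content is unmodified as a rename.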
