Document new features in change log and configuration reference
huddlej committed May 26, 2021
1 parent 115cb99 commit e45903e
Showing 2 changed files with 20 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/change_log.md
@@ -5,6 +5,7 @@ We also use this change log to document new features that maintain backward comp

## New features since last version update

- 26 May 2021: Support default (full) GISAID metadata and sequences from the "Download packages" interface by converting this default format into Nextstrain-compatible metadata. Additionally, the workflow now deduplicates metadata and sequences at the beginning and also supports reading metadata and sequences directly from GISAID's tar archives. ([#640](https://github.com/nextstrain/ncov/pull/640))
- 25 May 2021: Support custom Auspice JSON prefixes with a new configuration parameter, `auspice_json_prefix`. [See the configuration reference for more details](https://nextstrain.github.io/ncov/configuration.html#auspice_json_prefix). ([#643](https://github.com/nextstrain/ncov/pull/643))

## v6 (20 May 2021)
19 changes: 19 additions & 0 deletions docs/configuration.md
@@ -497,6 +497,25 @@ Valid attributes for list entries in `inputs` are provided below.
* description: A list of prefixes to strip from strain names in metadata and sequence records to maintain consistent strain names when analyzing data from multiple sources.
* default: `["hCoV-19/", "SARS-CoV-2/"]`

## sanitize_metadata
* type: object
* description: Parameters to configure how to sanitize metadata to a Nextstrain-compatible format.

### parse_location_field
* type: string
* description: Field in the metadata that stores GISAID-formatted location details (e.g., `North America / USA / Washington`) to be parsed into `region`, `country`, `division`, and `location` fields.
* default: `Location`
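
As a rough illustration of the parsing described above, the sketch below splits a GISAID-style location string on `/` into the four target fields. It is not the workflow's own implementation: the helper name `parse_location` and the choice to pad missing levels with empty strings are assumptions made for this example.

```python
# Hypothetical helper, not part of the ncov workflow.
LOCATION_FIELDS = ("region", "country", "division", "location")


def parse_location(value, separator="/"):
    """Split a GISAID-style location string into named fields."""
    parts = [part.strip() for part in value.split(separator)]
    # Assumption: levels missing from the input become empty strings.
    parts += [""] * (len(LOCATION_FIELDS) - len(parts))
    # zip() ignores any levels beyond the four named fields.
    return dict(zip(LOCATION_FIELDS, parts))


print(parse_location("North America / USA / Washington"))
# {'region': 'North America', 'country': 'USA', 'division': 'Washington', 'location': ''}
```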

### rename_fields
* type: array
* description: List of `old=new` pairs, each mapping a field name in the input metadata to the name it should have in the sanitized metadata.
* default: `["Virus name=strain", "Collection date=date"]`
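
To make the `old=new` format of the default value concrete, here is a minimal sketch of how such rules could be turned into a mapping and applied to one metadata record. The function names `parse_rename_rules` and `rename_record` are hypothetical and the sample record is invented for illustration; the workflow's actual script may handle this differently.

```python
# Hypothetical helpers, not part of the ncov workflow.
def parse_rename_rules(rules):
    """Turn a list of "old=new" strings into a dict of {old: new}."""
    mapping = {}
    for rule in rules:
        old, separator, new = rule.partition("=")
        if not separator:
            raise ValueError(f"Expected 'old=new', got: {rule!r}")
        mapping[old] = new
    return mapping


def rename_record(record, mapping):
    """Rename the keys of one metadata record, leaving unmapped keys as-is."""
    return {mapping.get(key, key): value for key, value in record.items()}


rules = ["Virus name=strain", "Collection date=date"]
record = {"Virus name": "hCoV-19/USA/example/2021", "Collection date": "2021-05-01"}
print(rename_record(record, parse_rename_rules(rules)))
# {'strain': 'hCoV-19/USA/example/2021', 'date': '2021-05-01'}
```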

### standardize_columns
* type: boolean
* description: Standardize column names by lowercasing and replacing all whitespace with underscores. This operation happens after renaming fields.
* default: `true`
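
The standardization step can be pictured with the short sketch below. It assumes each whitespace character becomes one underscore, which is one reading of the description above; the helper name `standardize_column` and the sample column names are illustrative only. Because this step runs after renaming, entries in `rename_fields` need to match the original, unstandardized column names.

```python
import re


# Hypothetical helper, not part of the ncov workflow.
def standardize_column(name):
    """Lowercase a column name and replace each whitespace character with "_"."""
    return re.sub(r"\s", "_", name.lower())


# Column names as they might look after renaming has already been applied.
columns = ["strain", "date", "Accession ID", "Pango lineage"]
print([standardize_column(column) for column in columns])
# ['strain', 'date', 'accession_id', 'pango_lineage']
```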

## subsampling
* type: object
* description: Schemes for subsampling data prior to phylogenetic inference to avoid sampling bias or focus an analysis on specific spatial and/or temporal scales. [See the SARS-CoV-2 tutorial for more details on defining subsampling schemes](https://docs.nextstrain.org/en/latest/tutorials/SARS-CoV-2/steps/customizing-analysis.html#subsampling).