Merge branch 'master' into fix/root-state-in-sequence-reconstruction
huddlej authored Dec 23, 2024
2 parents 5b717d6 + 77ae31e commit f556cc9
Showing 24 changed files with 1,318 additions and 109 deletions.
17 changes: 10 additions & 7 deletions .github/workflows/ci.yaml
@@ -41,19 +41,18 @@ jobs:
  strategy:
  matrix:
  python-version:
- - '3.8'
  - '3.9'
  - '3.10'
  - '3.11'
  - '3.12'
- biopython-version:
- # list of Biopython versions with support for a new Python version
- # from https://github.com/biopython/biopython/blob/master/NEWS.rst
+ biopython-version:
+ # list of Biopython versions with support for a new Python version
+ # from https://github.com/biopython/biopython/blob/master/NEWS.rst
  - '1.80' # first to support Python 3.10 and 3.11
  - '1.82' # first to support Python 3.12
- - '' # latest
- exclude:
- # some older Biopython versions are incompatible with later Python versions
+ - '' # latest
+ exclude:
+ # some older Biopython versions are incompatible with later Python versions
  - { biopython-version: '1.80', python-version: '3.12' }
  defaults:
  run:
@@ -115,7 +114,11 @@ jobs:
- lassa
- measles
- mpox
- oropouche
- rabies
- seasonal-cov
- wnv
- yellow-fever
- zika

name: pathogen-repo-ci (${{ matrix.pathogen }})
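
The `exclude` entry in the CI matrix above removes one combination from the cross product of Python and Biopython versions. A rough Python sketch of that expansion follows; the function name `expand_matrix` is hypothetical, and the matching rule only approximates how GitHub Actions evaluates a matrix.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Return every combination of the axes, minus those matching an exclude entry."""
    names = list(axes)
    combos = [dict(zip(names, values)) for values in product(*axes.values())]
    # Drop a combination when every key/value pair of some exclude entry matches it.
    return [
        combo
        for combo in combos
        if not any(all(combo.get(k) == v for k, v in entry.items()) for entry in exclude)
    ]


# Axis values copied from the workflow after this commit ("" stands for the latest release).
axes = {
    "python-version": ["3.9", "3.10", "3.11", "3.12"],
    "biopython-version": ["1.80", "1.82", ""],
}
exclude = [{"biopython-version": "1.80", "python-version": "3.12"}]
jobs = expand_matrix(axes, exclude)
```

Under these assumptions the matrix yields 4 × 3 − 1 = 11 jobs, since only the Biopython 1.80 / Python 3.12 pair is skipped.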
2 changes: 1 addition & 1 deletion .github/workflows/release.yaml
@@ -10,7 +10,7 @@ on:
type: string
jobs:
run:
- if: github.ref == github.event.repository.default_branch
+ if: github.ref_name == github.event.repository.default_branch
uses: ./.github/workflows/ci.yaml
secrets: inherit
with:
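
The condition change above matters because `github.ref` holds a fully qualified ref such as `refs/heads/master`, while `github.ref_name` and the repository's default branch are short names. A simplified Python sketch of the comparison follows; the helper `ref_name` is hypothetical, the example values are assumptions, and pull request refs are not handled.

```python
def ref_name(ref: str) -> str:
    """Strip the refs/heads/ or refs/tags/ prefix from a fully qualified Git ref."""
    for prefix in ("refs/heads/", "refs/tags/"):
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref


ref = "refs/heads/master"   # assumed example value of github.ref
default_branch = "master"   # assumed example value of the default branch

old_condition = ref == default_branch            # compares full ref to short name
new_condition = ref_name(ref) == default_branch  # compares short name to short name
```

With these example values the old condition is never true, so the job would have been skipped even on the default branch.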
7 changes: 0 additions & 7 deletions .pylintrc
@@ -317,13 +317,6 @@ max-line-length=100
# Maximum number of lines in a module
max-module-lines=1000

- # List of optional constructs for which whitespace checking is disabled. `dict-
- # separator` is used to allow tabulation in dicts, etc.: {1 : 1,\n222: 2}.
- # `trailing-comma` allows a space between comma and closing bracket: (a, ).
- # `empty-line` allows space-only lines.
- no-space-check=trailing-comma,
- dict-separator
-
# Allow the body of a class to be on the same line as the declaration if body
# contains single statement.
single-line-class-stmt=no
3 changes: 3 additions & 0 deletions .readthedocs.yml
@@ -11,6 +11,9 @@ build:
  # generated from the full git history in conf.py.
  - git fetch --unshallow

+ sphinx:
+ configuration: docs/conf.py
+
  python:
  install:
  - method: pip
19 changes: 19 additions & 0 deletions CHANGES.md
@@ -9,9 +9,28 @@
### Bug Fixes

* ancestral, refine: Explicitly specify how the root and ambiguous states are handled during sequence reconstruction and mutation counting. [#1690][] (@rneher)
+ * titers: Fix type errors in code associated with cross-validation of models. [#1688][] (@huddlej)

+ [#1688]: https://github.com/nextstrain/augur/pull/1688
[#1690]: https://github.com/nextstrain/augur/pull/1690

+ ## 27.0.0 (9 December 2024)
+
+ ### Major Changes
+
+ - Drop support for Python 3.8. [#1693] (@victorlin)
+ - Drop support for older versions of jsonschema (<4.18.0). [#1691] (@victorlin)
+ - Drop support for xopen <2.0.0. [#1692] (@victorlin)
+
+ ### Bug fixes
+
+ - export: validation will no longer crash with `KeyError: 'tree'` when newer versions of jsonschema (≥4.18.0) are installed. [#1358] (@victorlin)
+
+ [#1358]: https://github.com/nextstrain/augur/issues/1358
+ [#1691]: https://github.com/nextstrain/augur/pull/1691
+ [#1692]: https://github.com/nextstrain/augur/pull/1692
+ [#1693]: https://github.com/nextstrain/augur/pull/1693
+
## 26.2.0 (20 November 2024)

### Features
45 changes: 45 additions & 0 deletions CITATION.cff
@@ -0,0 +1,45 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."

preferred-citation:
type: article
title: "Augur: a bioinformatics toolkit for phylogenetic analyses of human pathogens"
doi: "10.21105/joss.02906"
journal: "Journal of Open Source Software"
year: 2021
month: 1
volume: 6
issue: 57
start: 2906
end: 2906

authors:
- family-names: Huddleston
given-names: John

- family-names: Hadfield
given-names: James

- family-names: Sibley
given-names: Thomas R.

- family-names: Lee
given-names: Jover

- family-names: Fay
given-names: Kairsten

- family-names: Ilcisin
given-names: Misja

- family-names: Harkins
given-names: Elias

- family-names: Bedford
given-names: Trevor

- family-names: Neher
given-names: Richard A.

- family-names: Hodcroft
given-names: Emma B.
2 changes: 1 addition & 1 deletion DEPRECATED.md
@@ -6,7 +6,7 @@ available for backwards compatibility, but should not be used in new code.

## `xopen` major version 1

- *Deprecated in version 25.1.0 (July 2024). Planned for removal November 2024 or after.*
+ *Deprecated in version 25.1.0 (July 2024). Removed in version 27.0.0 (December 2024).*

## `augur parse` preference of `name` over `strain` as the sequence ID field

2 changes: 2 additions & 0 deletions README.md
@@ -40,6 +40,8 @@ Try out an analysis of real virus data by [completing the Zika tutorial](https:/

Huddleston J, Hadfield J, Sibley TR, Lee J, Fay K, Ilcisin M, Harkins E, Bedford T, Neher RA, Hodcroft EB, (2021). Augur: a bioinformatics toolkit for phylogenetic analyses of human pathogens. Journal of Open Source Software, 6(57), 2906, https://doi.org/10.21105/joss.02906

+ For other formats, refer to [CITATION.cff](./CITATION.cff).
+
## License and copyright

Copyright 2014-2022 Trevor Bedford and Richard Neher.
6 changes: 1 addition & 5 deletions augur/__main__.py
@@ -24,11 +24,7 @@ def main():
errors="backslashreplace",
newline=None,

- # Always line-buffer stderr since we only use it for messaging, not
- # data output. This is the Python default from 3.9 onwards, but we
- # also run on 3.8 where it's not. Be consistent regardless of Python
- # version.
- line_buffering=True,
+ # By default, stderr is always line-buffered.
)

return augur.run( argv[1:] )
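
Dropping the explicit `line_buffering=True` is safe because `TextIOWrapper.reconfigure()` leaves any setting that is not passed unchanged. The sketch below demonstrates this on an in-memory stream rather than on the real `sys.stderr`; the helper name `reconfigured_stream` and the sample text are illustrative assumptions.

```python
import io


def reconfigured_stream(line_buffering):
    """Build a text stream, then reconfigure it without passing line_buffering."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii", line_buffering=line_buffering)
    # Same keyword arguments as the hunk above, minus line_buffering.
    stream.reconfigure(encoding="utf-8", errors="backslashreplace", newline=None)
    return stream


stream = reconfigured_stream(line_buffering=True)
# A lone surrogate cannot be encoded as UTF-8, so backslashreplace escapes it.
stream.write("lone surrogate: \udc80\n")
# The newline triggered a flush because line buffering is still enabled.
written = stream.buffer.getvalue()
```

The stream keeps its original line-buffering mode after `reconfigure()`, which is the behavior the new code comment relies on for Python 3.9 and later.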
2 changes: 1 addition & 1 deletion augur/__version__.py
@@ -1,4 +1,4 @@
- __version__ = '26.2.0'
+ __version__ = '27.0.0'


def is_augur_version_compatible(version):
16 changes: 2 additions & 14 deletions augur/io/file.py
@@ -2,21 +2,9 @@
from contextlib import contextmanager
from io import IOBase
from textwrap import dedent
- from xopen import xopen
+ from xopen import xopen, _PipedCompressionProgram
from augur.errors import AugurError

- # Workaround to maintain compatibility with both xopen v1 and v2
- # Around November 2024, we shall drop support for xopen v1
- # by removing the try-except block and using
- # _PipedCompressionProgram directly
- try:
- from xopen import _PipedCompressionProgram as PipedCompressionReader
- from xopen import _PipedCompressionProgram as PipedCompressionWriter
- except ImportError:
- from xopen import ( # type: ignore[attr-defined, no-redef]
- PipedCompressionReader,
- PipedCompressionWriter,
- )

ENCODING = "utf-8"

@@ -63,7 +51,7 @@ def open_file(path_or_buffer, mode="r", **kwargs):
Try re-saving the file using the {e.encoding!r} encoding."""))


- elif isinstance(path_or_buffer, (IOBase, PipedCompressionReader, PipedCompressionWriter)):
+ elif isinstance(path_or_buffer, (IOBase, _PipedCompressionProgram)):
yield path_or_buffer

else:
72 changes: 37 additions & 35 deletions augur/titer_model.py
@@ -38,42 +38,42 @@ def load_from_file(filenames, excluded_sources=None):
>>> type(measurements)
<class 'dict'>
>>> len(measurements)
- 11
+ 248
>>> len(strains)
- 13
+ 62
>>> len(sources)
- 5
+ 15
Inspect specific measurements. First, inspect a measurement that has a
specific value in the input.
- >>> measurements[("A/Acores/11/2013", ("A/Alabama/5/2010", "F27/10"))]
- [80.0]
+ >>> measurements[("A/Wisconsin/3/2007", ("A/Wisconsin/3/2007", "A/Wis3/07"))]
+ [5120.0]
Next, inspect a measurement that has a thresholded value at the lower
- bound of detection (e.g., "<80"). This measurement should be reported as
- one half of its threshold value (e.g., 40.0).
+ bound of detection (e.g., "<40"). This measurement should be reported as
+ one half of its threshold value (e.g., 20.0).
- >>> measurements[("A/Acores/11/2013", ("A/Victoria/208/2009", "F7/10"))]
- [40.0]
+ >>> measurements[("A/HongKong/1/1968", ("A/Victoria/3/1975", "A/Vic/3/75"))]
+ [20.0]
Inspect a measurement that has a thresholded value at the upper bound of
- detection (">1280"). This measurement should be reported as twice its
- threshold value (e.g., 2560.0).
+ detection (">5120"). This measurement should be reported as twice its
+ threshold value (e.g., 10240.0).
- >>> measurements[("A/Acores/SU43/2012", ("A/Texas/50/2012", "F36/12"))]
- [2560.0]
+ >>> measurements[("A/Wisconsin/3/2007", ("A/Uruguay/716/2007", "A/Uru716/07"))]
+ [10240.0]
Confirm that excluding sources produces fewer measurements.
- >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv", excluded_sources=["NIMR_Sep2013_7-11.csv"])
+ >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv", excluded_sources=["Hay2001"])
>>> len(measurements)
- 5
+ 223
Request measurements for a test/reference/serum tuple that should not
exist after excluding its source.
- >>> measurements.get(("A/Acores/11/2013", ("A/Alabama/5/2010", "F27/10")))
+ >>> measurements.get(("A/HongKong/1/1968", ("A/HongKong/1/1968", "A/HK/1/68")))
>>>
Missing titer data should produce an error.
@@ -150,12 +150,10 @@ def count_strains(titers):
--------
>>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
>>> titer_counts = TiterCollection.count_strains(measurements)
- >>> titer_counts["A/Acores/11/2013"]
- 6
- >>> titer_counts["A/Acores/SU43/2012"]
- 3
- >>> titer_counts["A/Cairo/63/2012"]
- 2
+ >>> titer_counts["A/Auckland/6/2003"]
+ 4
+ >>> titer_counts["A/Brisbane/9/2006"]
+ 15
"""
counts = defaultdict(int)
for key in titers.keys():
@@ -187,22 +185,26 @@ def filter_strains(titers, strains):
--------
>>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
>>> len(measurements)
- 11
+ 248
Test the case when a test strain exists in the subset but the none of
its corresponding reference strains do.
- >>> len(TiterCollection.filter_strains(measurements, ["A/Acores/11/2013"]))
+ >>> len(TiterCollection.filter_strains(measurements, ["A/Oslo/244/1997"]))
0
- Test when both the test and reference strains exist in the subset.
+ Test when both the test and reference strains exist in the subset. This
+ first test gets a heterologous pair (first and second strain) and the
+ autologous pair for the second strain.
- >>> len(TiterCollection.filter_strains(measurements, ["A/Acores/11/2013", "A/Alabama/5/2010", "A/Athens/112/2012"]))
+ >>> len(TiterCollection.filter_strains(measurements, ["A/Oslo/244/1997", "A/Johannesburg/33/1994"]))
2
- >>> len(TiterCollection.filter_strains(measurements, ["A/Acores/11/2013", "A/Acores/SU43/2012", "A/Alabama/5/2010", "A/Athens/112/2012"]))
- 3
+ Test when no strains are provided.
+ >>> len(TiterCollection.filter_strains(measurements, []))
+ 0
"""
return {key: value for key, value in titers.items()
if key[0] in strains and key[1][0] in strains}
@@ -226,7 +228,7 @@ def __init__(self, titers, **kwargs):
else:
self.titers = titers
strain_counts = type(self).count_strains(titers)
- self.strains = strain_counts.keys()
+ self.strains = list(strain_counts.keys())

def read_titers(self, fname):
self.titer_fname = fname
@@ -318,11 +320,11 @@ def strain_census(self, titers):
>>> titers = TiterCollection(measurements)
>>> sera, ref_strains, test_strains = titers.strain_census(measurements)
>>> len(sera)
- 9
+ 66
>>> len(ref_strains)
- 9
+ 27
>>> len(test_strains)
- 13
+ 62
Parameters
----------
@@ -415,7 +417,7 @@ def make_training_set(self, training_fraction=1.0, subset_strains=False, **kwarg
from random import sample
tmp = set(self.test_strains)
tmp.difference_update(self.ref_strains) # don't use references viruses in the set to sample from
- training_strains = sample(tmp, int(training_fraction*len(tmp)))
+ training_strains = sample(sorted(tmp), int(training_fraction*len(tmp)))
for tmpstrain in self.ref_strains: # add all reference viruses to the training set
if tmpstrain not in training_strains:
training_strains.append(tmpstrain)
@@ -504,7 +506,7 @@ def validate(self, plot=False, cutoff=0.0, validation_set = None, fname=None):
pred_titer = self.predict_titer(key[0], key[1], cutoff=cutoff)
validation[key] = (val, pred_titer)

- validation_array = np.array(validation.values())
+ validation_array = np.array(list(validation.values()))
actual = validation_array[:,0]
predicted = validation_array[:,1]

@@ -517,7 +519,7 @@ def validate(self, plot=False, cutoff=0.0, validation_set = None, fname=None):
'rms_error': np.sqrt(np.mean((actual-predicted)**2)),
}
pprint(model_performance)
- model_performance['values'] = validation.values()
+ model_performance['values'] = list(validation.values())

self.validation = model_performance
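
The `list(...)` and `sorted(...)` wrappers in the hunks above address Python 3 type issues: a dictionary view is not a sequence, and `random.sample` no longer accepts a set (deprecated in 3.9, a `TypeError` from 3.11). A minimal sketch with made-up data follows; the variable names and values are illustrative only and do not come from the titer model.

```python
import random

import numpy as np

# Hypothetical validation results keyed by strain pair: (actual, predicted).
validation = {("A", "B"): (3.0, 2.5), ("A", "C"): (1.0, 1.5)}

# Wrapping the view directly yields a 0-dimensional object array,
# which cannot be sliced by column.
as_view = np.array(validation.values())

# Materializing the view first yields the intended 2-D array.
as_list = np.array(list(validation.values()))
actual, predicted = as_list[:, 0], as_list[:, 1]

# random.sample requires a sequence; sorting the set also fixes the
# input order, so a seeded generator gives reproducible samples.
strains = {"s3", "s1", "s2"}
training = random.Random(0).sample(sorted(strains), 2)
```

Sorting before sampling is one way to get a sequence from a set; `list(tmp)` would also work but leaves the input order dependent on set iteration order.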

2 changes: 0 additions & 2 deletions augur/util_support/node_data.py
@@ -31,8 +31,6 @@ def deep_add_or_update(self, d, key, value):
raise exception
"""

- # TODO Python 3.9: Use the new dictionary union operator (https://www.python.org/dev/peps/pep-0584/)
-
if key not in d or (
not isinstance(d[key], dict) and not isinstance(value, dict)
):