Commit

update news
sbfnk committed Oct 3, 2023
1 parent da970f5 commit 0d796d7
Showing 5 changed files with 26 additions and 25 deletions.
2 changes: 1 addition & 1 deletion NEWS.md
@@ -2,7 +2,7 @@

## Documentation

-* Two new vignettes have been added to cover the workflow and example uses
+* Two new vignettes have been added to cover the workflow and example uses. By @sbfnk in #458 and reviewed by @jamesmbaazam.

# EpiNow2 1.4.0

2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -46,7 +46,7 @@ navbar:
href: articles/estimate_infections_workflow.html
- text: Examples: estimate_infections()
href: articles/estimate_infections_options.html
-- text: epinow(): production mode
+- text: Using epinow() for running in production mode
href: articles/epinow.html
casestudies:
text: Case studies
12 changes: 6 additions & 6 deletions vignettes/epinow.Rmd.orig
@@ -1,13 +1,13 @@
---
-title: "epinow(): production mode"
+title: "Using epinow() for running in production mode"
output:
rmarkdown::html_vignette:
toc: false
number_sections: false
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
vignette: >
-%\VignetteIndexEntry{epinow(): production mode}
+%\VignetteIndexEntry{Using epinow() for running in production mode}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
@@ -22,13 +22,13 @@ knitr::opts_chunk$set(
```

The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
-This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional infections that determine, for example, where output gets stored and what output exactly.
+This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette

# Running the model on a single region

-To run the model in production model for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
+To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
Here we use the example delay and generation time distributions that come with the package.
This should be replaced with parameters relevant to the system that is being studied.

@@ -62,14 +62,14 @@ res <- epinow(reported_cases,
res$plots$R
```

-The initial messages here indicate where log files can be fund, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).
+The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).

# Running the model simultaneously on multiple regions

The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
This is done with the `regional_epinow()` function.

-Say, for example, we construct a data sets containing two regions, `testland` and `realland` (in this simple example both containing the same case data).
+Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).

```{r construct_regional_cases}
cases <- example_confirmed[1:60]
8 changes: 4 additions & 4 deletions vignettes/estimate_infections_options.Rmd.orig
@@ -43,7 +43,7 @@ options(mc.cores = 4)

# Data

-As data set we will use an example data set that is included in the package, representing an outbreak of COVID-19 with an initial rapid increase followed by decreasing incidence.
+We will use an example data set that is included in the package, representing an outbreak of COVID-19 with an initial rapid increase followed by decreasing incidence.

```{r data, fig.height = 4}
library("ggplot2")
@@ -70,8 +70,8 @@ incubation_period <- get_incubation_period(
incubation_period
```

-For the reporting delay, we use a lognormal distribution with mean of 2 days and
-standard deviation of 1 day.
+For the reporting delay, we use a lognormal distribution with mean of 2 days and standard deviation of 1 day.
+Note that the mean and standard deviation must be converted to the log scale, which can be done using the `convert_log_logmean()` function.
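
As an illustration that is not part of the committed vignette, the log-scale conversion mentioned above can be sketched in base R. For a lognormal distribution with natural-scale mean $m$ and standard deviation $s$, the log-scale standard deviation is $\sqrt{\log(1 + s^2/m^2)}$ and the log-scale mean is $\log(m)$ minus half the log-scale variance. The helper name `to_log_scale()` is hypothetical and is not an _EpiNow2_ function.

```r
# Hypothetical helper (not an EpiNow2 function): convert the mean and standard
# deviation of a lognormal distribution from the natural scale to the log scale.
to_log_scale <- function(mean, sd) {
  logsd <- sqrt(log(1 + (sd / mean)^2))
  logmean <- log(mean) - logsd^2 / 2
  c(logmean = logmean, logsd = logsd)
}

# Reporting delay from the text: mean of 2 days, standard deviation of 1 day.
pars <- to_log_scale(mean = 2, sd = 1)

# Sanity check: convert back to the natural scale using the standard
# lognormal moment formulas.
back_mean <- exp(pars[["logmean"]] + pars[["logsd"]]^2 / 2)
back_sd <- sqrt(
  (exp(pars[["logsd"]]^2) - 1) * exp(2 * pars[["logmean"]] + pars[["logsd"]]^2)
)
```

The round trip through `back_mean` and `back_sd` is a cheap way to confirm that the values passed to a distribution specification are on the intended scale.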

```{r reporting_delay}
reporting_delay <- dist_spec(
@@ -81,7 +81,7 @@ reporting_delay <- dist_spec(
reporting_delay
```

-We can combine these delays into one by summing them up
+_EpiNow2_ provides a feature that allows us to combine these delays into one by summing them up

```{r delay}
delay <- incubation_period + reporting_delay
27 changes: 14 additions & 13 deletions vignettes/estimate_infections_workflow.Rmd.orig
@@ -20,13 +20,13 @@ knitr::opts_chunk$set(
)
```

-In this vignette we describe the typical workflow by which someone might obtain reproduction number estimates and short-term forecasts for a given disease spreading in a given setting.
-The vignette uses the default model included in the package.
+This vignette describes the typical workflow for estimating reproduction numbers and performing short-term forecasts for a disease spreading in a given setting using _EpiNow2_.
+The vignette uses the default non-stationary Gaussian process model included in the package.
See other vignettes for a more thorough exploration of [alternative model variants](estimate_infections_options.html) and [theoretical background](estimate_infections.html).

# Data

-Obtaining a good and full understanding of the data being used an important first step in any inference procedure such as the one applied here.
+Obtaining a good and full understanding of the data being used is an important first step in any inference procedure such as the one applied here.
_EpiNow2_ expects data in the format of a data frame with two columns, `date` and `confirm`, where `confirm` stands for the number of confirmed counts - although in reality this can be applied to any data including suspected cases and lab-confirmed outcomes.
The user might already have the data as such a time series provided, for example, on public dashboards or directly from public health authorities.
Alternatively, they can be constructed from individual-level data, for example using the [incidence2](https://cran.r-project.org/web/packages/incidence2/index.html) R package.
@@ -49,12 +49,14 @@ We first load the _EpiNow2_ package.
library("EpiNow2")
```

-We then set the number of cores to use. We will want to run 4 MCMC chains in parallel so we set this to 4. If we had fewer than 4 available or wanted to run fewer than 4 chains (at the expense of some robustness), or had fewer than 4 computing cores available we could set it to that. To find out the number of cores available one can use the [detectCores](https://rdrr.io/r/parallel/detectCores.html) function from the `parallel` package.
+We then set the number of cores to use. We will want to run 4 MCMC chains in parallel so we set this to 4.

```{r}
options(mc.cores = 4)
```

+If we had fewer than 4 available or wanted to run fewer than 4 chains (at the expense of some robustness), or had fewer than 4 computing cores available we could set it to that. To find out the number of cores available one can use the [detectCores](https://rdrr.io/r/parallel/detectCores.html) function from the `parallel` package.
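
As an illustration that is not part of the committed vignette, the advice above could be applied along the following lines. This is a minimal sketch that assumes 4 chains are wanted; the variable names `n_chains`, `available` and `n_cores` are made up for the example.

```r
# Use one core per chain, capped at the number of cores actually available.
n_chains <- 4L
available <- parallel::detectCores()

# detectCores() can return NA when the number of cores cannot be determined,
# so fall back to a single core in that case.
n_cores <- if (is.na(available)) 1L else min(n_chains, available)

options(mc.cores = n_cores)
```

Capping at the number of chains avoids requesting more cores than can be used, and the fallback keeps the script working on systems where the core count is unknown.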

# Parameters

Once a data set has been identified, a number of relevant parameters need to be considered before using _EpiNow2_.
@@ -67,7 +69,7 @@ They are defined using a common interface with the `dist_spec()` function.
For help with this function, see its manual page

```{r eval = FALSE}
-?dist_spec
+?EpiNow2::dist_spec
```

In all cases, the distributions given can be *fixed* (i.e. have no uncertainty) or *variable* (i.e. have associated uncertainty).
@@ -88,15 +90,15 @@ dist_spec(

There are various ways the specific delay distributions mentioned below might be obtained.
Often, they will come directly from the existing literature reviewed by the user and studies conducted elsewhere.
-Sometimes it might be possible to obatined them from existing databases, e.g. using the [epiparameter](https://github.com/epiverse-trace/epiparameter) R package.
+Sometimes it might be possible to obtain them from existing databases, e.g. using the [epiparameter](https://github.com/epiverse-trace/epiparameter) R package.
Alternatively they might be obtainable from raw data, e.g. linelists.
The _EpiNow2_ package contains functionality for estimating delay distributions from observed delays in the `estimate_delays()` function.
For a more comprehensive treatment of delays and their estimation avoiding common biases one can consider, for example, the [dynamicaltruncation](https://github.com/parksw3/epidist-paper) R package and associated paper.

### Generation intervals

The generation interval is a delay distribution that describes the amount of time that passes between an individual becoming infected and infecting someone else.
-In _EpiNow2_, the generation time distribution is defined by a call to `generation_time_opts()`, a function that takes a single argument defined as a `dist_spec`.
+In _EpiNow2_, the generation time distribution is defined by a call to `generation_time_opts()`, a function that takes a single argument defined as a `dist_spec` object (returned by `dist_spec()`).
For example, to define the generation time as gamma distributed with uncertain mean centered on 3 (sd: 2) and sd centered on 1 (sd: 0.1), a maximum value of 10 and weighted by the number of case data points we would use

```{r, results = 'hide'}
@@ -126,7 +128,7 @@ reporting_delay <- dist_spec(
incubation_period + reporting_delay
```

-In _EpiNow2_, the reporting delay distribution is defined by a call to `delay_opts()`, a function that takes a single argument defined as a `dist_spec`.
+In _EpiNow2_, the reporting delay distribution is defined by a call to `delay_opts()`, a function that takes a single argument defined as a `dist_spec` object (returned by `dist_spec()`).
For example, if our observations were by symptom onset we would use

```{r, results = 'hide'}
@@ -143,7 +145,7 @@ delay_opts(delay)
### Truncation

Besides the delay from infection to the event that is recorded in the data, there can also be a delay from that event to being recorded in the data.
-For example, data by symptom onset may only enter the data once lab confirmation has occurred, or even a day or two after that confirmation.
+For example, data reported by symptom onset may only become part of the dataset once lab confirmation has occurred, or even a day or two after that confirmation.
Statistically, this means our data is right-truncated.
In practice, it means that recent data will be unlikely to be complete.

@@ -152,7 +154,6 @@ One can then use methods that use the amount of backfilling that occurred 1, 2,
In _EpiNow2_, this can be done using the `estimate_truncation()` method which returns, amongst others, posterior estimates of the truncation distribution.
For more details on the model used for this, see the [estimate_truncation](estimate_truncation.html) vignette.


```{r eval = FALSE}
?estimate_truncation
```
@@ -161,7 +162,7 @@ In the `estimate_infections()` function, the truncation distribution is defined
This will then be used to correct for right truncation in the data.

The separation of estimation of right truncation on the one hand and estimation of the reproduction number on the other may be attractive for practical purposes but is questionable statistically as it separates two processes that are not strictly separable, potentially introducing a bias.
-For an alternative approach where these are estimated jointly that is being developed by some of the package authors of _EpiNow2_ with collaborators, see the [epinowcast](https://package.epinowcast.org/) package.
+An alternative approach where these are estimated jointly is being implemented in the [epinowcast](https://package.epinowcast.org/) package, which is being developed by the _EpiNow2_ developers with collaborators.

## Completeness of reporting

@@ -181,7 +182,7 @@ The default model that `estimate_infections()` uses to estimate reproduction num
This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
It can be changed using the `rt_opts()` function.
-For example, if the users believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would used
+For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use

```{r results = 'hide'}
rt_prior <- list(mean = 2, sd = 1)
@@ -208,7 +209,7 @@ def <- estimate_infections(
)
```

-Alternatively, for production environments the `epinow` function can be used that uses `estimate_infections` internally but adds functionality for logging and saving results and plots in dedicated places in the user's file system.
+Alternatively, for production environments, we recommend using the `epinow()` function. It uses `estimate_infections()` internally and provides functionality for logging and saving results and plots in dedicated directories in the user's file system.

## Forecasting secondary outcomes

