Skip to content

Commit

Permalink
Add Getting Started Vignette
Browse files Browse the repository at this point in the history
  • Loading branch information
jamesmbaazam committed Nov 24, 2023
1 parent b34a3a3 commit 69bf48d
Showing 1 changed file with 153 additions and 0 deletions.
153 changes: 153 additions & 0 deletions vignettes/EpiNow2.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,153 @@
---
title: "Getting started with EpiNow2"
output:
rmarkdown::html_vignette:
toc: false
number_sections: false
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
vignette: >
%\VignetteIndexEntry{Getting started with EpiNow2}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

## Quick start

In the following section we give an overview of the simple use case for `epinow()` and `regional_epinow()`.

The first step to using the package is to load it as follows.

```{r, message = FALSE}
library(EpiNow2)
```

### Reporting delays, incubation period and generation time

Distributions can be supplied in two ways. First, by supplying delay data to `estimate_delay()`, where a subsampled bootstrapped lognormal will be fit to account for uncertainty in the observed data without being biased by changes in incidence (see `?EpiNow2::estimate_delay()`).

Second, by specifying predetermined delays with uncertainty using `dist_spec()`. An arbitrary number of delay distributions are supported in `dist_spec()` with the most common use case likely to be an incubation period followed by a reporting delay (see `?EpiNow2::dist_spec()`).

For example if data on the delay between onset and infection was available we could fit a distribution to it, using `estimate_delay()`, with appropriate uncertainty as follows (note this is a synthetic example),
```{r, eval = FALSE}
reporting_delay <- estimate_delay(
rlnorm(1000, log(2), 1),
max_value = 14, bootstraps = 1
)
```

If data was not available we could instead specify an informed estimate of the likely delay
using `dist_spec()`. To demonstrate, we choose a lognormal distribution with mean 2, standard deviation 1 (both with some uncertainty) and a maximum of 10. *This is just an example and unlikely to apply in any particular use case*.

```{r}
reporting_delay <- dist_spec(
mean = convert_to_logmean(2, 1),
mean_sd = 0.1,
sd = convert_to_logsd(2, 1),
sd_sd = 0.1,
max = 10,
dist = "lognormal"
)
reporting_delay
```

For the rest of this vignette, we will use inbuilt example literature estimates for the incubation period and generation time of Covid-19 (see [here](https://github.com/epiforecasts/EpiNow2/tree/main/data-raw) for the code that generates these estimates). *These distributions are unlikely to be applicable for your use case. We strongly recommend investigating what might be the best distributions to use in any given use case.*

```{r}
example_generation_time
example_incubation_period
```

Now, to the functions.

### [epinow()](https://epiforecasts.io/EpiNow2/reference/epinow.html)

This function represents the core functionality of the package and includes results reporting, plotting, and optional saving. It requires a data frame of cases by date of report and the distributions defined above.

Load example case data from `{EpiNow2}`.

```{r}
reported_cases <- example_confirmed[1:60]
head(reported_cases)
```

Estimate cases by date of infection, the time-varying reproduction number, the rate of growth, and forecast these estimates into the future by 7 days. Summarise the posterior and return a summary table and plots for reporting purposes. If a `target_folder` is supplied results can be internally saved (with the option to also turn off explicit returning of results). Here we use the default model parameterisation that prioritises real-time performance over run-time or other considerations. For other formulations see the documentation for `estimate_infections()`.

```{r, message = FALSE, warning = FALSE}
estimates <- epinow(
reported_cases = reported_cases,
generation_time = generation_time_opts(example_generation_time),
delays = delay_opts(example_incubation_period + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
stan = stan_opts(cores = 4, control = list(adapt_delta = 0.99)),
verbose = interactive()
)
names(estimates)
```

Both summary measures and posterior samples are returned for all parameters in an easily explored format which can be accessed using `summary`. The default is to return a summary table of estimates for key parameters at the latest date partially supported by data.

```{r}
knitr::kable(summary(estimates))
```

Summarised parameter estimates can also easily be returned, either filtered for a single parameter or for all parameters.

```{r}
head(summary(estimates, type = "parameters", params = "R"))
```

Reported cases are returned in a separate data frame in order to streamline the reporting of forecasts and for model evaluation.

```{r}
head(summary(estimates, output = "estimated_reported_cases"))
```

A range of plots are returned (with the single summary plot shown below). These plots can also be generated using the following `plot` method.

```{r, dpi = 330, fig.width = 12, fig.height = 12, message = FALSE, warning = FALSE}
plot(estimates)
```


### [regional_epinow()](https://epiforecasts.io/EpiNow2/reference/regional_epinow.html)

The `regional_epinow()` function runs the `epinow()` function across multiple regions in
an efficient manner.

Define cases in multiple regions delineated by the region variable.

```{r}
reported_cases <- data.table::rbindlist(list(
data.table::copy(reported_cases)[, region := "testland"],
reported_cases[, region := "realland"]
))
head(reported_cases)
```

Calling `regional_epinow()` runs the `epinow()` on each region in turn (or in parallel depending on the settings used). Here we switch to using a weekly random walk rather than the full Gaussian process model giving us piecewise constant estimates by week.

```{r, message = FALSE, warning = FALSE}
estimates <- regional_epinow(
reported_cases = reported_cases,
generation_time = generation_time_opts(example_generation_time),
delays = delay_opts(example_incubation_period + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.2), rw = 7),
gp = NULL,
stan = stan_opts(cores = 4, warmup = 250, samples = 1000)
)
```

Results from each region are stored in a `regional` list with across region summary measures and plots stored in a `summary` list. All results can be set to be internally saved by setting the `target_folder` and `summary_dir` arguments. Each region can be estimated in parallel using the `{future}` package (when in most scenarios `cores` should be set to 1). For routine use each MCMC chain can also be run in parallel (with `future` = TRUE) with a time out (`max_execution_time`) allowing for partial results to be returned if a subset of chains is running longer than expected. See the documentation for the `{future}` package for details on nested futures.

Summary measures that are returned include a table formatted for reporting (along with raw results for further processing). Futures updated will extend the S3 methods used above to smooth access to this output.

```{r}
knitr::kable(estimates$summary$summarised_results$table)
```

A range of plots are again returned (with the single summary plot shown below).

```{r, dpi = 330, fig.width = 12, fig.height = 12, message = FALSE, warning = FALSE}
estimates$summary$summary_plot
```

0 comments on commit 69bf48d

Please sign in to comment.