Restructure README and add Getting Started vignette (#521)

* Restructure README
* Add Getting Started Vignette
* Automatic readme update
* Precompile Getting Started vignette
* Fix partial matching
* Use relative links
* Restructure models section
* Wrap package name in braces
* Automatic readme update
* Format package name

---------

Co-authored-by: GitHub Action <[email protected]>
epiforecasts · Jan 9, 2024 · bcf297c
1 parent 0a8569e

Showing 6 changed files with 641 additions and 459 deletions.
diff --git a/README.Rmd b/README.Rmd
@@ -17,32 +17,54 @@ knitr::opts_chunk$set(
 
 [![MIT license](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/epiforecasts/EpiNow2/blob/main/LICENSE.md/)   [![GitHub contributors](https://img.shields.io/github/contributors/epiforecasts/EpiNow2)](https://github.com/epiforecasts/EpiNow2/graphs/contributors)  [![universe](https://epiforecasts.r-universe.dev/badges/EpiNow2)](http://epiforecasts.r-universe.dev/ui/#package:EpiNow2) [![GitHub commits](https://img.shields.io/github/commits-since/epiforecasts/EpiNow2/v1.4.0.svg?color=orange)](https://GitHub.com/epiforecasts/EpiNow2/commit/main/) [![DOI](https://zenodo.org/badge/272995211.svg)](https://zenodo.org/badge/latestdoi/272995211)
 
-This package estimates the time-varying reproduction number, growth rate, and doubling time using a range of open-source tools ([Abbott et al.](https://doi.org/10.12688/wellcomeopenres.16006.1)), and current best practices ([Gostic et al.](https://doi.org/10.1371/journal.pcbi.1008409)). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported. 
+## Summary
 
-It estimates the time-varying reproduction number on cases by date of infection (using a similar approach to that implemented in [`{EpiEstim}`](https://github.com/mrc-ide/EpiEstim)). Imputed infections are then mapped to observed data (for example cases by date of report) via a series of uncertain delay distributions (in the examples in the package documentation these are an incubation period and a reporting delay) and a reporting model that can include weekly periodicity. 
+`{EpiNow2}` estimates the time-varying reproduction number, growth rate, and doubling time using a range of open-source tools ([Abbott et al.](https://doi.org/10.12688/wellcomeopenres.16006.1)), and current best practices ([Gostic et al.](https://doi.org/10.1371/journal.pcbi.1008409)). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
+
+Forecasting is also supported for the time-varying reproduction number, infections, and reported cases using the same generative process approach as used for estimation.
+
+<details> <summary> More details </summary>
+
+`{EpiNow2}` estimates the time-varying reproduction number on cases by date of infection (using a similar approach to that implemented in [`{EpiEstim}`](https://github.com/mrc-ide/EpiEstim)). Imputed infections are then mapped to observed data (for example cases by date of report) via a series of uncertain delay distributions (in the examples in the package documentation these are an incubation period and a reporting delay) and a reporting model that can include weekly periodicity. 
 
 Uncertainty is propagated from all inputs into the final parameter estimates, helping to mitigate spurious findings. This is handled internally. The time-varying reproduction estimates and the uncertain generation time also give time-varying estimates of the rate of growth.
 
-The default model uses a non-stationary Gaussian process to estimate the time-varying reproduction number and then infer infections. Other options include:
+</details>
+
+<details> <summary> Models provided </summary>
+
+`{EpiNow2}` provides three models:
+
+* [`estimate_infections()`](reference/estimate_infections.html): Reconstruct cases by date of infection from reported cases.
+
+* [`estimate_secondary()`](reference/estimate_secondary.html): Estimate the relationship between primary and secondary observations, for example, deaths (secondary) based on hospital admissions (primary), or bed occupancy (secondary) based on hospital admissions (primary).
+
+* [`estimate_truncation()`](reference/estimate_truncation.html): Estimate a truncation distribution from multiple snapshots of the same data source over time. For more flexibility, check out the [`{epinowcast}`](https://package.epinowcast.org/) package.
+
+
+The default model in `estimate_infections()` uses a non-stationary Gaussian process to estimate the time-varying reproduction number and infer infections. Other options, which generally reduce runtimes at the cost of the granularity of estimates or real-time performance, include:
 
 * A stationary Gaussian process (faster to estimate but currently gives reduced performance for real time estimates).
 * User specified breakpoints.
 * A fixed reproduction number.
-* As piecewise constant by combining a fixed reproduction number with breakpoints.
-* As a random walk (by combining a fixed reproduction number with regularly spaced breakpoints (i.e weekly)).
-* Inferring infections using deconvolution/back-calculation and then calculating the time-varying reproduction number.
+* A piecewise constant reproduction number, combining a fixed reproduction number with breakpoints.
+* A random walk, combining a fixed reproduction number with regularly spaced breakpoints (for example, weekly).
+* A deconvolution/back-calculation method for inferring infections, followed by calculation of the time-varying reproduction number.
 * Adjustment for the remaining susceptible population beyond the forecast horizon.
 
-These options generally reduce runtimes at the cost of the granularity of estimates or at the cost of real-time performance.
+The documentation for [`estimate_infections`](reference/estimate_infections.html) provides examples of the implementation of the different options available. 
 
-The documentation for [`estimate_infections`](https://epiforecasts.io/EpiNow2/reference/estimate_infections.html) provides examples of the implementation of the different options available. 
+`{EpiNow2}` is designed to be used via a single call to one of two functions:
 
-Forecasting is also supported for the time-varying reproduction number, infections and reported cases using the same generative process approach as used for estimation.
+* [`epinow()`](reference/epinow.html): Estimate Rt and cases by date of infection and forecast these infections into the future.
 
-A simple example of using the package to estimate a national Rt for Covid-19 can be found [here](https://gist.github.com/seabbs/163d0f195892cde685c70473e1f5e867).
+* [`regional_epinow()`](reference/regional_epinow.html): Run `epinow()` across multiple regions in an efficient manner.
 
+These two functions call [`estimate_infections()`](reference/estimate_infections.html), which works to reconstruct cases by date of infection from reported cases.
 
-`EpiNow2` also supports adjustment for truncated data via `estimate_truncation()` (though users may be interested in more flexibility and if so should check out the [`epinowcast`](https://package.epinowcast.org/) package), and for estimating dependent observations (i.e deaths based on hospital admissions) using `estimate_secondary()`.
+For more details on using each function see the [function documentation](reference/index.html).
+
+</details>
 
 ## Installation
 
@@ -80,147 +102,54 @@ remotes::install_github("epiforecasts/EpiNow2")
 
 Windows users will need a working installation of Rtools in order to build the package from source. See [here](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started#checking-the-c-toolchain) for a guide to installing Rtools for use with Stan (which is the statistical modelling platform used for the underlying model). For simple deployment/development a prebuilt docker image is also available (see documentation [here](https://github.com/epiforecasts/EpiNow2/wiki/Docker)).
 
-## Quick start
-
-`{EpiNow2}` is designed to be used with a single function call or to be
-used in an ad-hoc fashion via individual function calls. The core functions of `{EpiNow2}` are the two single-call functions [`epinow()`](https://epiforecasts.io/EpiNow2/reference/epinow.html), [`regional_epinow()`](https://epiforecasts.io/EpiNow2/reference/regional_epinow.html), plus functions
-[`estimate_infections()`](https://epiforecasts.io/EpiNow2/reference/estimate_infections.html), [`estimate_secondary()`](https://epiforecasts.io/EpiNow2/reference/estimate_secondary.html) and [`estimate_truncation()`](https://epiforecasts.io/EpiNow2/reference/estimate_truncation.html). In the following section we give an overview of the simple use case for `epinow` and `regional_epinow`. [`estimate_infections()`](https://epiforecasts.io/EpiNow2/reference/estimate_infections.html) can be used on its own to infer the underlying infection case curve from reported cases and estimate Rt. Estimating the underlying infection case curve via back-calculation (and then calculating Rt) is substantially less computationally demanding than generating using default settings but may result in less reliable estimates of Rt. For more details on using each function see the [function documentation](https://epiforecasts.io/EpiNow2/reference/index.html).
-
-The first step to using the package is to load it as follows.
-
-```{r, message = FALSE}
-library(EpiNow2)
-```
-
-### Reporting delays, incubation period and generation time
-
-Distributions can either be fitted using package functionality or determined elsewhere and then defined with uncertainty for use in `{EpiNow2}`. When data is supplied a subsampled bootstrapped lognormal will be fit (to account for uncertainty in the observed data without being biased by changes in incidence). An arbitrary number of delay distributions are supported with the most common use case likely to be a incubation period followed by a reporting delay.
-
-For example if data on the delay between onset and infection was available we could fit a distribution to it with appropriate uncertainty as follows (note this is a synthetic example),
-```{r, eval = FALSE}
-reporting_delay <- estimate_delay(
-  rlnorm(1000, log(2), 1),
-  max_value = 14, bootstraps = 1
-)
-```
-
-If data was not available we could instead make an informed estimate of the likely delay. We choose a lognormal distribution with mean 2, standard deviation 1 (both with some uncertainty) and a maximum of 10. *This is just an example and unlikely to apply in any particular use case*.
-
-```{r}
-reporting_delay <- dist_spec(
-  mean = convert_to_logmean(2, 1),
-  mean_sd = 0.1,
-  sd = convert_to_logsd(2, 1),
-  sd_sd = 0.1,
-  max = 10,
-  dist = "lognormal"
-)
-reporting_delay
-```
-
-We use example literature estimates for the incubation period and generation time of Covid-19 (see [here](https://github.com/epiforecasts/EpiNow2/tree/main/data-raw) for the code that generates these estimates). *These distributions are unlikely to be applicable for your use case. We strongly recommend investigating what might be the best distributions to use in any given use case.*
+## Resources
 
-```{r}
-example_generation_time
-example_incubation_period
-```
-
-### [epinow()](https://epiforecasts.io/EpiNow2/reference/epinow.html) 
-
-This function represents the core functionality of the package and includes results reporting, plotting and optional saving. It requires a data frame of cases by date of report and the distributions defined above.
-
-Load example case data from `{EpiNow2}`.
-
-```{r}
-reported_cases <- example_confirmed[1:60]
-head(reported_cases)
-```
-
-Estimate cases by date of infection, the time-varying reproduction number, the rate of growth and forecast these estimates into the future by 7 days. Summarise the posterior and return a summary table and plots for reporting purposes. If a `target_folder` is supplied results can be internally saved (with the option to also turn off explicit returning of results). Here we use the default model parameterisation that prioritises real-time performance over run-time or other considerations. For other formulations see the documentation for `estimate_infections()`.
+<details> <summary> Getting Started </summary>
 
-```{r, message = FALSE, warning = FALSE}
-estimates <- epinow(
-  reported_cases = reported_cases,
-  generation_time = generation_time_opts(example_generation_time),
-  delays = delay_opts(example_incubation_period + reporting_delay),
-  rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
-  stan = stan_opts(cores = 4, control = list(adapt_delta = 0.99)),
-  verbose = interactive()
-)
-names(estimates)
-```
+The [Getting Started vignette](EpiNow2.html)
+is your quickest entry point to the package. It provides a quick run through of
+the two main functions in the package and how to set them up. It also
+discusses how to summarise and visualise the results after running the models.
 
-Both summary measures and posterior samples are returned for all parameters in an easily explored format which can be accessed using `summary`. The default is to return a summary table of estimates for key parameters at the latest date partially supported by data. 
+</details>
 
-```{r}
-knitr::kable(summary(estimates))
-```
+<details> <summary> Package website </summary>
 
-Summarised parameter estimates can also easily be returned, either filtered for a single parameter or for all parameters.
+The [package website](index.html) provides
+various resources for learning about the package, including the function
+reference, details about each model (model definition), workflows for
+each model (usage), and case studies or literature of applications of
+the package.
 
-```{r}
-head(summary(estimates, type = "parameters", params = "R"))
-```
+</details>
 
-Reported cases are returned in a separate data frame in order to streamline the reporting of forecasts and for model evaluation.
+<details> <summary> End-to-end workflows </summary>
 
-```{r}
-head(summary(estimates, output = "estimated_reported_cases"))
-```
-
-A range of plots are returned (with the single summary plot shown below). These plots can also be generated using the following `plot` method.
+The [workflow vignette](articles/estimate_infections_workflow.html) provides guidance on the end-to-end process of estimating reproduction numbers and performing short-term forecasts for a disease spreading in a given setting.
 
-```{r, dpi = 330, fig.width = 12, fig.height = 12, message = FALSE, warning = FALSE}
-plot(estimates)
-```
+</details>
 
+<details> <summary> Model definitions </summary>
 
-### [regional_epinow()](https://epiforecasts.io/EpiNow2/reference/regional_epinow.html)
+On the website, we provide the mathematical definition of each model.
+For example, the model definition vignette for `estimate_infections()` can be found [here](articles/estimate_infections.html).
 
-The `regional_epinow()` function runs the `epinow()` function across multiple regions in
-an efficient manner.
+</details>
 
-Define cases in multiple regions delineated by the region variable.
+<details> <summary> Example implementations </summary>
 
-```{r}
-reported_cases <- data.table::rbindlist(list(
-  data.table::copy(reported_cases)[, region := "testland"],
-  reported_cases[, region := "realland"]
-))
-head(reported_cases)
-```
-
-Calling `regional_epinow()` runs the `epinow()` on each region in turn (or in parallel depending on the settings used). Here we switch to using a weekly random walk rather than the full Gaussian process model giving us piecewise constant estimates by week.
-
-```{r, message = FALSE, warning = FALSE}
-estimates <- regional_epinow(
-  reported_cases = reported_cases,
-  generation_time = generation_time_opts(example_generation_time),
-  delays = delay_opts(example_incubation_period + reporting_delay),
-  rt = rt_opts(prior = list(mean = 2, sd = 0.2), rw = 7),
-  gp = NULL,
-  stan = stan_opts(cores = 4, warmup = 250, samples = 1000)
-)
-```
-
-Results from each region are stored in a `regional` list with across region summary measures and plots stored in a `summary` list. All results can be set to be internally saved by setting the `target_folder` and `summary_dir` arguments. Each region can be estimated in parallel using the `{future}` package (when in most scenarios `cores` should be set to 1). For routine use each MCMC chain can also be run in parallel (with `future` = TRUE) with a time out (`max_execution_time`) allowing for partial results to be returned if a subset of chains is running longer than expected. See the documentation for the `{future}` package for details on nested futures.
-
-Summary measures that are returned include a table formatted for reporting (along with raw results for further processing). Futures updated will extend the S3 methods used above to smooth access to this output.
-
-```{r}
-knitr::kable(estimates$summary$summarised_results$table)
-```
-
-A range of plots are again returned (with the single summary plot shown below).
+A simple example of using the package to estimate a national Rt for Covid-19 can be found [here](https://gist.github.com/seabbs/163d0f195892cde685c70473e1f5e867).
 
-```{r, dpi = 330, fig.width = 12, fig.height = 12, message = FALSE, warning = FALSE}
-estimates$summary$summary_plot
-```
+</details>
 
-### Reporting templates
+<details> <summary> Reporting templates </summary>
 
 Rmarkdown templates are provided in the package (`templates`) for semi-automated reporting of estimates. If using these templates to report your results please highlight our [limitations](https://doi.org/10.12688/wellcomeopenres.16006.1) as these are key to understanding the results from `{EpiNow2}` .
 
+</details>
+
 ## Contributing
 
-File an issue [here](https://github.com/epiforecasts/EpiNow2/issues) if you have identified an issue with the package. Please note that due to operational constraints priority will be given to users informing government policy or offering methodological insights. We welcome all contributions, in particular those that improve the approach or the robustness of the code base. We also welcome additions and extensions to the underlying model either in the form of options or improvements.
+We welcome all contributions. If you have identified an issue with the package,
+you can file an issue [here](https://github.com/epiforecasts/EpiNow2/issues). We also welcome additions and extensions to the underlying model either in the form of options or improvements. If you wish to contribute in any form, please follow the
+[package contributing guide](https://github.com/epiforecasts/EpiNow2/blob/main/.github/CONTRIBUTING.md).