update news

epiforecasts · Oct 3, 2023 · 0d796d7
1 parent da970f5
commit 0d796d7

Showing 5 changed files with 26 additions and 25 deletions.
diff --git a/NEWS.md b/NEWS.md
@@ -2,7 +2,7 @@
 
 ## Documentation
 
-* Two new vignettes have been added to cover the workflow and example uses
+* Two new vignettes have been added to cover the workflow and example uses. By @sbfnk in #458 and reviewed by @jamesmbaazam.
 
 # EpiNow2 1.4.0
 

diff --git a/_pkgdown.yml b/_pkgdown.yml
@@ -46,7 +46,7 @@ navbar:
         href: articles/estimate_infections_workflow.html
       - text: Examples: estimate_infections()
         href: articles/estimate_infections_options.html
-      - text: epinow(): production mode
+      - text: Using epinow() for running in production mode
         href: articles/epinow.html
     casestudies:
       text: Case studies

diff --git a/vignettes/epinow.Rmd.orig b/vignettes/epinow.Rmd.orig
@@ -1,13 +1,13 @@
 ---
-title: "epinow(): production mode"
+title: "Using epinow() for running in production mode"
 output:
   rmarkdown::html_vignette:
     toc: false
     number_sections: false
 bibliography: library.bib
 csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
 vignette: >
-  %\VignetteIndexEntry{epinow(): production mode}
+  %\VignetteIndexEntry{Using epinow() for running in production mode}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
 ---
@@ -22,13 +22,13 @@ knitr::opts_chunk$set(
 ```
 
 The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
-This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional infections that determine, for example, where output gets stored and what output exactly.
+This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
 The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
 For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette
 
 # Running the model on a single region
 
-To run the model in production model for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
+To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
 Here we use the example delay and generation time distributions that come with the package.
 This should be replaced with parameters relevant to the system that is being studied.
 
@@ -62,14 +62,14 @@ res <- epinow(reported_cases,
 res$plots$R
 ```
 
-The initial messages here indicate where log files can be fund, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).
+The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).
 
 # Running the model simultaneously on multiple regions
 
 The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
 This is done with the `regional_epinow()` function.
 
-Say, for example, we construct a data sets containing two regions, `testland` and `realland` (in this simple example both containing the same case data).
+Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).
 
 ```{r construct_regional_cases}
 cases <- example_confirmed[1:60]

diff --git a/vignettes/estimate_infections_options.Rmd.orig b/vignettes/estimate_infections_options.Rmd.orig
@@ -43,7 +43,7 @@ options(mc.cores = 4)
 
 # Data
 
-As data set we will use an example data set that is included in the package, representing an outbreak of COVID-19 with an initial rapid increase followed by decreasing incidence.
+We will use an example data set that is included in the package, representing an outbreak of COVID-19 with an initial rapid increase followed by decreasing incidence.
 
 ```{r data, fig.height = 4}
 library("ggplot2")
@@ -70,8 +70,8 @@ incubation_period <- get_incubation_period(
 incubation_period
 ```
 
-For the reporting delay, we use a lognormal distribution with mean of 2 days and 
-standard deviation of 1 day.
+For the reporting delay, we use a lognormal distribution with mean of 2 days and standard deviation of 1 day.
+Note that the mean and standard deviation must be converted to the log scale, which can be done using the `convert_to_logmean()` and `convert_to_logsd()` functions.
 
 ```{r reporting_delay}
 reporting_delay <- dist_spec(
@@ -81,7 +81,7 @@ reporting_delay <- dist_spec(
 reporting_delay
 ```
 
-We can combine these delays into one by summing them up
+_EpiNow2_ provides a feature that allows us to combine these delays into one by summing them up
 
 ```{r delay}
 delay <- incubation_period + reporting_delay

diff --git a/vignettes/estimate_infections_workflow.Rmd.orig b/vignettes/estimate_infections_workflow.Rmd.orig
@@ -20,13 +20,13 @@ knitr::opts_chunk$set(
 )
 ```
 
-In this vignette we describe the typical workflow by which someone might obtain reproduction number estimates and short-term forecasts for a given disease spreading in a given setting.
-The vignette uses the default model included in the package.
+This vignette describes the typical workflow for estimating reproduction numbers and performing short-term forecasts for a disease spreading in a given setting using _EpiNow2_.
+The vignette uses the default non-stationary Gaussian process model included in the package.
 See other vignettes for a more thorough exploration of [alternative model variants](estimate_infections_options.html) and [theoretical background](estimate_infections.html).
 
 # Data
 
-Obtaining a good and full understanding of the data being used an important first step in any inference procedure such as the one applied here. 
+Obtaining a good and full understanding of the data being used is an important first step in any inference procedure such as the one applied here. 
 _EpiNow2_ expects data in the format of a data frame with two columns, `date` and `confirm`, where `confirm` stands for the number of confirmed counts - although in reality this can be applied to any data including suspected cases and lab-confirmed outcomes.
 The user might already have the data as such a time series provided, for example, on public dashboards or directly from public health authorities.
 Alternatively, they can be constructed from individual-level data, for example using the [incidence2](https://cran.r-project.org/web/packages/incidence2/index.html) R package.
@@ -49,12 +49,14 @@ We first load the _EpiNow2_ package.
 library("EpiNow2")
 ```
 
-We then set the number of cores to use. We will want to run 4 MCMC chains in parallel so we set this to 4. If we had fewer than 4 available or wanted to run fewer than 4 chains (at the expense of some robustness), or had fewer than 4 computing cores available we could set it to that. To find out the number of cores available one can use the [detectCores](https://rdrr.io/r/parallel/detectCores.html) function from the `parallel` package.
+We then set the number of cores to use. We will want to run 4 MCMC chains in parallel so we set this to 4. 
 
 ```{r}
 options(mc.cores = 4)
 ```
 
+If we wanted to run fewer than 4 chains (at the expense of some robustness) or had fewer than 4 computing cores available, we could set it to that number instead. To find out the number of cores available one can use the [detectCores](https://rdrr.io/r/parallel/detectCores.html) function from the `parallel` package.
+
 # Parameters
 
 Once a data set has been identified, a number of relevant parameters need to be considered before using _EpiNow2_.
@@ -67,7 +69,7 @@ They are defined using a common interface with the `dist_spec()` function.
 For help with this function, see its manual page
 
 ```{r eval = FALSE}
-?dist_spec
+?EpiNow2::dist_spec
 ```
 
 In all cases, the distributions given can be *fixed* (i.e. have no uncertainty) or *variable* (i.e. have associated uncertainty).
@@ -88,15 +90,15 @@ dist_spec(
 
 There are various ways the specific delay distributions mentioned below might be obtained.
 Often, they will come directly from the existing literature reviewed by the user and studies conducted elsewhere.
-Sometimes it might be possible to obatined them from existing databases, e.g. using the [epiparameter](https://github.com/epiverse-trace/epiparameter) R package.
+Sometimes it might be possible to obtain them from existing databases, e.g. using the [epiparameter](https://github.com/epiverse-trace/epiparameter) R package.
 Alternatively they might be obtainable from raw data, e.g. linelists.
 The _EpiNow2_ package contains functionality for estimating delay distributions from observed delays in the `estimate_delays()` function.
 For a more comprehensive treatment of delays and their estimation avoiding common biases one can consider, for example, the [dynamicaltruncation](https://github.com/parksw3/epidist-paper) R package and associated paper.
 
 ### Generation intervals
 
 The generation interval is a delay distribution that describes the amount of time that passes between an individual becoming infected and infecting someone else.
-In _EpiNow2_, the generation time distribution is defined by a call to `generation_time_opts()`, a function that takes a single argument defined as a `dist_spec`.
+In _EpiNow2_, the generation time distribution is defined by a call to `generation_time_opts()`, a function that takes a single argument defined as a `dist_spec` object (returned by `dist_spec()`).
 For example, to define the generation time as gamma distributed with uncertain mean centered on 3 (sd: 2) and sd centered on 1 (sd: 0.1), a maximum value of 10 and weighted by the number of case data points we would use
 
 ```{r, results = 'hide'}
@@ -126,7 +128,7 @@ reporting_delay <- dist_spec(
 incubation_period + reporting_delay
 ```
 
-In _EpiNow2_, the reporting delay distribution is defined by a call to `delay_opts()`, a function that takes a single argument defined as a `dist_spec`.
+In _EpiNow2_, the reporting delay distribution is defined by a call to `delay_opts()`, a function that takes a single argument defined as a `dist_spec` object (returned by `dist_spec()`).
 For example, if our observations were by symptom onset we would use
 
 ```{r, results = 'hide'}
@@ -143,7 +145,7 @@ delay_opts(delay)
 ### Truncation
 
 Besides the delay from infection to the event that is recorded in the data, there can also be a delay from that event to being recorded in the data.
-For example, data by symptom onset may only enter the data once lab confirmation has occurred, or even a day or two after that confirmation.
+For example, data reported by symptom onset may only become part of the dataset once lab confirmation has occurred, or even a day or two after that confirmation.
 Statistically, this means our data is right-truncated.
 In practice, it means that recent data will be unlikely to be complete.
 
@@ -152,7 +154,6 @@ One can then use methods that use the amount of backfilling that occurred 1, 2,
 In _EpiNow2_, this can be done using the `estimate_truncation()` method which returns, amongst others, posterior estimates of the truncation distribution.
 For more details on the model used for this, see the [estimate_truncation](estimate_truncation.html) vignette.
 
-
 ```{r eval = FALSE}
 ?estimate_truncation
 ```
@@ -161,7 +162,7 @@ In the `estimate_infections()` function, the truncation distribution is defined
 This will then be used to correct for right truncation in the data.
 
 The separation of estimation of right truncation on the one hand and estimation of the reproduction number on the other may be attractive for practical purposes but is questionable statistically as it separates two processes that are not strictly separable, potentially introducing a bias.
-For an alternative approach where these are estimated jointly that is being developed by some of the package authors of _EpiNow2_ with collaborators, see the [epinowcast](https://package.epinowcast.org/) package.
+An alternative approach where these are estimated jointly is being implemented in the [epinowcast](https://package.epinowcast.org/) package, which is being developed by the _EpiNow2_ developers with collaborators.
 
 ## Completeness of reporting
 
@@ -181,7 +182,7 @@ The default model that `estimate_infections()` uses to estimate reproduction num
 This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
 By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
 It can be changed using the `rt_opts()` function.
-For example, if the users believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would used
+For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use
 
 ```{r results = 'hide'}
 rt_prior <- list(mean =  2, sd = 1)
@@ -208,7 +209,7 @@ def <- estimate_infections(
 )
 ```
 
-Alternatively, for production environments the `epinow` function can be used that uses `estimate_infections` internally but adds functionality for logging and saving results and plots in dedicated places in the user's file system.
+Alternatively, for production environments, we recommend using the `epinow()` function. It uses `estimate_infections()` internally and provides functionality for logging and saving results and plots in dedicated directories in the user's file system.
 
 ## Forecasting secondary outcomes