workflow/options vignettes (#458)

* new vignettes * precompile workflow * set up for workflow * compiled vignettes * add news item * consistent italicisation of EpiNow2 * consistent () for functions * split out epinow() vignette * recompile * update pkdown menu * update news * recompile
epiforecasts · Oct 3, 2023 · 14718c0 · 14718c0
1 parent 89cee00
commit 14718c0
Show file tree

Hide file tree

Showing 23 changed files with 1,661 additions and 1 deletion.
diff --git a/NEWS.md b/NEWS.md
@@ -1,5 +1,9 @@
 # EpiNow2 1.4.9000
 
+## Documentation
+
+* Two new vignettes have been added to cover the workflow and example uses. By @sbfnk in #458 and reviewed by @jamesmbaazam.
+
 ## Package
 
 * Reduced the number of long-running examples.

diff --git a/_pkgdown.yml b/_pkgdown.yml
@@ -40,6 +40,14 @@ navbar:
         href: articles/estimate_truncation.html
       - text: Gaussian Process implementation details
         href: articles/gaussian_process_implementation_details.html
+    usage:
+      text: Usage
+      - text: Workflow for Rt estimation and forecasting
+        href: articles/estimate_infections_workflow.html
+      - text: Examples: estimate_infections()
+        href: articles/estimate_infections_options.html
+      - text: Using epinow() for running in production mode
+        href: articles/epinow.html
     casestudies:
       text: Case studies
       menu:

diff --git a/vignettes/.gitignore b/vignettes/.gitignore
@@ -1,2 +1 @@
 *.html
-*.R
diff --git a/vignettes/epinow.Rmd b/vignettes/epinow.Rmd
@@ -0,0 +1,177 @@
+---
+title: "Using epinow() for running in production mode"
+output:
+  rmarkdown::html_vignette:
+    toc: false
+    number_sections: false
+bibliography: library.bib
+csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
+vignette: >
+  %\VignetteIndexEntry{Using epinow() for running in production mode}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+
+
+The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
+This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
+The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
+For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette
+
+# Running the model on a single region
+
+To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
+Here we use the example delay and generation time distributions that come with the package.
+This should be replaced with parameters relevant to the system that is being studied.
+
+
+```r
+library("EpiNow2")
+options(mc.cores = 4)
+reported_cases <- example_confirmed[1:60]
+incubation_period <- get_incubation_period(
+  disease = "SARS-CoV-2", source = "lauer"
+)
+reporting_delay <- dist_spec(
+  mean = convert_to_logmean(2, 1), mean_sd = 0,
+  sd = convert_to_logsd(2, 1), sd_sd = 0, max = 10
+)
+delay <- incubation_period + reporting_delay
+generation_time <- get_generation_time(
+  disease = "SARS-CoV-2", source = "ganyani"
+)
+rt_prior <- list(mean = 2, sd = 0.1)
+```
+
+We can then run the `epinow()` function with the same arguments as `estimate_infections()`.
+
+
+```r
+res <- epinow(reported_cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt_opts(prior = rt_prior),
+  target_folder = "results"
+)
+#> Logging threshold set at INFO for the EpiNow2 logger
+#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
+#> Logging threshold set at INFO for the EpiNow2.epinow logger
+#> Writing EpiNow2.epinow logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
+#> WARN [2023-10-02 20:01:50] epinow: There were 17 divergent transitions after warmup. See
+#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
+#> to find out why this is a problem and how to eliminate them. - 
+#> WARN [2023-10-02 20:01:50] epinow: Examine the pairs() plot to diagnose sampling problems
+#>  -
+res$plots$R
+#> NULL
+```
+
+The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).
+
+# Running the model simultaneously on multiple regions
+
+The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
+This is done with the `regional_epinow()` function.
+
+Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).
+
+
+```r
+cases <- example_confirmed[1:60]
+cases <- data.table::rbindlist(list(
+  data.table::copy(cases)[, region := "testland"],
+  cases[, region := "realland"]
+ ))
+```
+
+To then run this on multiple regions using the default options above, we could use
+
+
+```r
+region_rt <- regional_epinow(
+  reported_cases = cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt_opts(prior = rt_prior),
+)
+#> INFO [2023-10-02 20:01:56] Producing following optional outputs: regions, summary, samples, plots, latest
+#> Logging threshold set at INFO for the EpiNow2 logger
+#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
+#> Logging threshold set at INFO for the EpiNow2.epinow logger
+#> Writing EpiNow2.epinow logs to: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
+#> INFO [2023-10-02 20:01:56] Reporting estimates using data up to: 2020-04-21
+#> INFO [2023-10-02 20:01:56] No target directory specified so returning output
+#> INFO [2023-10-02 20:01:56] Producing estimates for: testland, realland
+#> INFO [2023-10-02 20:01:56] Regions excluded: none
+#> INFO [2023-10-02 20:03:16] Completed estimates for: testland
+#> INFO [2023-10-02 20:04:44] Completed estimates for: realland
+#> INFO [2023-10-02 20:04:44] Completed regional estimates
+#> INFO [2023-10-02 20:04:44] Regions with estimates: 2
+#> INFO [2023-10-02 20:04:44] Regions with runtime errors: 0
+#> INFO [2023-10-02 20:04:44] Producing summary
+#> INFO [2023-10-02 20:04:44] No summary directory specified so returning summary output
+#> INFO [2023-10-02 20:04:45] No target directory specified so returning timings
+## summary
+region_rt$summary$summarised_results$table
+#>      Region New confirmed cases by infection date
+#> 1: realland                   2243 (1148 -- 4405)
+#> 2: testland                   2327 (1127 -- 4494)
+#>    Expected change in daily cases Effective reproduction no.
+#> 1:              Likely decreasing         0.88 (0.62 -- 1.2)
+#> 2:              Likely decreasing         0.89 (0.61 -- 1.2)
+#>              Rate of growth Doubling/halving time (days)
+#> 1: -0.028 (-0.095 -- 0.037)             -25 (19 -- -7.3)
+#> 2: -0.025 (-0.098 -- 0.042)             -27 (17 -- -7.1)
+## plot
+region_rt$summary$plots$R
+```
+
+![plot of chunk regional_epinow](figure/regional_epinow-1.png)
+
+If instead, we wanted to use the Gaussian Process for `testland` and a weekly random walk for `realland` we could specify these separately using the `opts_list()` and `update_list()` functions
+
+
+```r
+gp <- opts_list(gp_opts(), cases)
+gp <- update_list(gp, list(realland = NULL))
+rt <- opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
+region_separate_rt <- regional_epinow(
+  reported_cases = cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt, gp = gp,
+)
+#> INFO [2023-10-02 20:04:45] Producing following optional outputs: regions, summary, samples, plots, latest
+#> Logging threshold set at INFO for the EpiNow2 logger
+#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
+#> Logging threshold set at INFO for the EpiNow2.epinow logger
+#> Writing EpiNow2.epinow logs to: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
+#> INFO [2023-10-02 20:04:45] Reporting estimates using data up to: 2020-04-21
+#> INFO [2023-10-02 20:04:45] No target directory specified so returning output
+#> INFO [2023-10-02 20:04:45] Producing estimates for: testland, realland
+#> INFO [2023-10-02 20:04:45] Regions excluded: none
+#> INFO [2023-10-02 20:06:46] Completed estimates for: testland
+#> INFO [2023-10-02 20:07:21] Completed estimates for: realland
+#> INFO [2023-10-02 20:07:21] Completed regional estimates
+#> INFO [2023-10-02 20:07:21] Regions with estimates: 2
+#> INFO [2023-10-02 20:07:21] Regions with runtime errors: 0
+#> INFO [2023-10-02 20:07:21] Producing summary
+#> INFO [2023-10-02 20:07:21] No summary directory specified so returning summary output
+#> INFO [2023-10-02 20:07:21] No target directory specified so returning timings
+## summary
+region_separate_rt$summary$summarised_results$table
+#>      Region New confirmed cases by infection date
+#> 1: realland                   2168 (1094 -- 4167)
+#> 2: testland                   2233 (1014 -- 4477)
+#>    Expected change in daily cases Effective reproduction no.
+#> 1:              Likely decreasing         0.86 (0.61 -- 1.2)
+#> 2:              Likely decreasing         0.88 (0.56 -- 1.2)
+#>              Rate of growth Doubling/halving time (days)
+#> 1: -0.032 (-0.096 -- 0.034)             -22 (20 -- -7.2)
+#> 2:  -0.028 (-0.11 -- 0.044)             -25 (16 -- -6.1)
+## plot
+region_separate_rt$summary$plots$R
+```
+
+![plot of chunk regional_epinow_multiple](figure/regional_epinow_multiple-1.png)
diff --git a/vignettes/epinow.Rmd.orig b/vignettes/epinow.Rmd.orig
@@ -0,0 +1,113 @@
+---
+title: "Using epinow() for running in production mode"
+output:
+  rmarkdown::html_vignette:
+    toc: false
+    number_sections: false
+bibliography: library.bib
+csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
+vignette: >
+  %\VignetteIndexEntry{Using epinow() for running in production mode}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  fig.width = 6.5,
+  fig.height = 6.5
+)
+```
+
+The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
+This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
+The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
+For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette
+
+# Running the model on a single region
+
+To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
+Here we use the example delay and generation time distributions that come with the package.
+This should be replaced with parameters relevant to the system that is being studied.
+
+```{r setup }
+library("EpiNow2")
+options(mc.cores = 4)
+reported_cases <- example_confirmed[1:60]
+incubation_period <- get_incubation_period(
+  disease = "SARS-CoV-2", source = "lauer"
+)
+reporting_delay <- dist_spec(
+  mean = convert_to_logmean(2, 1), mean_sd = 0,
+  sd = convert_to_logsd(2, 1), sd_sd = 0, max = 10
+)
+delay <- incubation_period + reporting_delay
+generation_time <- get_generation_time(
+  disease = "SARS-CoV-2", source = "ganyani"
+)
+rt_prior <- list(mean = 2, sd = 0.1)
+```
+
+We can then run the `epinow()` function with the same arguments as `estimate_infections()`.
+
+```{r epinow}
+res <- epinow(reported_cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt_opts(prior = rt_prior),
+  target_folder = "results"
+)
+res$plots$R
+```
+
+The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).
+
+# Running the model simultaneously on multiple regions
+
+The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
+This is done with the `regional_epinow()` function.
+
+Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).
+
+```{r construct_regional_cases}
+cases <- example_confirmed[1:60]
+cases <- data.table::rbindlist(list(
+  data.table::copy(cases)[, region := "testland"],
+  cases[, region := "realland"]
+ ))
+```
+
+To then run this on multiple regions using the default options above, we could use
+
+```{r regional_epinow, fig.width = 8}
+region_rt <- regional_epinow(
+  reported_cases = cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt_opts(prior = rt_prior),
+)
+## summary
+region_rt$summary$summarised_results$table
+## plot
+region_rt$summary$plots$R
+```
+
+If instead, we wanted to use the Gaussian Process for `testland` and a weekly random walk for `realland` we could specify these separately using the `opts_list()` and `update_list()` functions
+
+```{r regional_epinow_multiple, fig.width = 8}
+gp <- opts_list(gp_opts(), cases)
+gp <- update_list(gp, list(realland = NULL))
+rt <- opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
+region_separate_rt <- regional_epinow(
+  reported_cases = cases,
+  generation_time = generation_time_opts(generation_time),
+  delays = delay_opts(delay),
+  rt = rt, gp = gp,
+)
+## summary
+region_separate_rt$summary$summarised_results$table
+## plot
+region_separate_rt$summary$plots$R
+```