Skip to content

Commit

Permalink
workflow/options vignettes (#458)
Browse files Browse the repository at this point in the history
* new vignettes

* precompile workflow

* set up for workflow

* compiled vignettes

* add news item

* consistent italicisation of EpiNow2

* consistent () for functions

* split out epinow() vignette

* recompile

* update pkdown menu

* update news

* recompile
  • Loading branch information
sbfnk authored Oct 3, 2023
1 parent 89cee00 commit 14718c0
Show file tree
Hide file tree
Showing 23 changed files with 1,661 additions and 1 deletion.
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
# EpiNow2 1.4.9000

## Documentation

* Two new vignettes have been added to cover the workflow and example uses. By @sbfnk in #458 and reviewed by @jamesmbaazam.

## Package

* Reduced the number of long-running examples.
Expand Down
8 changes: 8 additions & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,14 @@ navbar:
href: articles/estimate_truncation.html
- text: Gaussian Process implementation details
href: articles/gaussian_process_implementation_details.html
usage:
text: Usage
- text: Workflow for Rt estimation and forecasting
href: articles/estimate_infections_workflow.html
- text: Examples: estimate_infections()
href: articles/estimate_infections_options.html
- text: Using epinow() for running in production mode
href: articles/epinow.html
casestudies:
text: Case studies
menu:
Expand Down
1 change: 0 additions & 1 deletion vignettes/.gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1 @@
*.html
*.R
177 changes: 177 additions & 0 deletions vignettes/epinow.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,177 @@
---
title: "Using epinow() for running in production mode"
output:
rmarkdown::html_vignette:
toc: false
number_sections: false
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
vignette: >
%\VignetteIndexEntry{Using epinow() for running in production mode}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---



The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette

# Running the model on a single region

To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
Here we use the example delay and generation time distributions that come with the package.
This should be replaced with parameters relevant to the system that is being studied.


```r
library("EpiNow2")
options(mc.cores = 4)
reported_cases <- example_confirmed[1:60]
incubation_period <- get_incubation_period(
disease = "SARS-CoV-2", source = "lauer"
)
reporting_delay <- dist_spec(
mean = convert_to_logmean(2, 1), mean_sd = 0,
sd = convert_to_logsd(2, 1), sd_sd = 0, max = 10
)
delay <- incubation_period + reporting_delay
generation_time <- get_generation_time(
disease = "SARS-CoV-2", source = "ganyani"
)
rt_prior <- list(mean = 2, sd = 0.1)
```

We can then run the `epinow()` function with the same arguments as `estimate_infections()`.


```r
res <- epinow(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt_opts(prior = rt_prior),
target_folder = "results"
)
#> Logging threshold set at INFO for the EpiNow2 logger
#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
#> Logging threshold set at INFO for the EpiNow2.epinow logger
#> Writing EpiNow2.epinow logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
#> WARN [2023-10-02 20:01:50] epinow: There were 17 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them. -
#> WARN [2023-10-02 20:01:50] epinow: Examine the pairs() plot to diagnose sampling problems
#> -
res$plots$R
#> NULL
```

The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).

# Running the model simultaneously on multiple regions

The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
This is done with the `regional_epinow()` function.

Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).


```r
cases <- example_confirmed[1:60]
cases <- data.table::rbindlist(list(
data.table::copy(cases)[, region := "testland"],
cases[, region := "realland"]
))
```

To then run this on multiple regions using the default options above, we could use


```r
region_rt <- regional_epinow(
reported_cases = cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt_opts(prior = rt_prior),
)
#> INFO [2023-10-02 20:01:56] Producing following optional outputs: regions, summary, samples, plots, latest
#> Logging threshold set at INFO for the EpiNow2 logger
#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
#> Logging threshold set at INFO for the EpiNow2.epinow logger
#> Writing EpiNow2.epinow logs to: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
#> INFO [2023-10-02 20:01:56] Reporting estimates using data up to: 2020-04-21
#> INFO [2023-10-02 20:01:56] No target directory specified so returning output
#> INFO [2023-10-02 20:01:56] Producing estimates for: testland, realland
#> INFO [2023-10-02 20:01:56] Regions excluded: none
#> INFO [2023-10-02 20:03:16] Completed estimates for: testland
#> INFO [2023-10-02 20:04:44] Completed estimates for: realland
#> INFO [2023-10-02 20:04:44] Completed regional estimates
#> INFO [2023-10-02 20:04:44] Regions with estimates: 2
#> INFO [2023-10-02 20:04:44] Regions with runtime errors: 0
#> INFO [2023-10-02 20:04:44] Producing summary
#> INFO [2023-10-02 20:04:44] No summary directory specified so returning summary output
#> INFO [2023-10-02 20:04:45] No target directory specified so returning timings
## summary
region_rt$summary$summarised_results$table
#> Region New confirmed cases by infection date
#> 1: realland 2243 (1148 -- 4405)
#> 2: testland 2327 (1127 -- 4494)
#> Expected change in daily cases Effective reproduction no.
#> 1: Likely decreasing 0.88 (0.62 -- 1.2)
#> 2: Likely decreasing 0.89 (0.61 -- 1.2)
#> Rate of growth Doubling/halving time (days)
#> 1: -0.028 (-0.095 -- 0.037) -25 (19 -- -7.3)
#> 2: -0.025 (-0.098 -- 0.042) -27 (17 -- -7.1)
## plot
region_rt$summary$plots$R
```

![plot of chunk regional_epinow](figure/regional_epinow-1.png)

If instead, we wanted to use the Gaussian Process for `testland` and a weekly random walk for `realland` we could specify these separately using the `opts_list()` and `update_list()` functions


```r
gp <- opts_list(gp_opts(), cases)
gp <- update_list(gp, list(realland = NULL))
rt <- opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
region_separate_rt <- regional_epinow(
reported_cases = cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt, gp = gp,
)
#> INFO [2023-10-02 20:04:45] Producing following optional outputs: regions, summary, samples, plots, latest
#> Logging threshold set at INFO for the EpiNow2 logger
#> Writing EpiNow2 logs to the console and: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/regional-epinow/2020-04-21.log
#> Logging threshold set at INFO for the EpiNow2.epinow logger
#> Writing EpiNow2.epinow logs to: /var/folders/gd/x84kkjzd6bn9rlf3f2v830c00000gp/T//Rtmp2g83K0/epinow/2020-04-21.log
#> INFO [2023-10-02 20:04:45] Reporting estimates using data up to: 2020-04-21
#> INFO [2023-10-02 20:04:45] No target directory specified so returning output
#> INFO [2023-10-02 20:04:45] Producing estimates for: testland, realland
#> INFO [2023-10-02 20:04:45] Regions excluded: none
#> INFO [2023-10-02 20:06:46] Completed estimates for: testland
#> INFO [2023-10-02 20:07:21] Completed estimates for: realland
#> INFO [2023-10-02 20:07:21] Completed regional estimates
#> INFO [2023-10-02 20:07:21] Regions with estimates: 2
#> INFO [2023-10-02 20:07:21] Regions with runtime errors: 0
#> INFO [2023-10-02 20:07:21] Producing summary
#> INFO [2023-10-02 20:07:21] No summary directory specified so returning summary output
#> INFO [2023-10-02 20:07:21] No target directory specified so returning timings
## summary
region_separate_rt$summary$summarised_results$table
#> Region New confirmed cases by infection date
#> 1: realland 2168 (1094 -- 4167)
#> 2: testland 2233 (1014 -- 4477)
#> Expected change in daily cases Effective reproduction no.
#> 1: Likely decreasing 0.86 (0.61 -- 1.2)
#> 2: Likely decreasing 0.88 (0.56 -- 1.2)
#> Rate of growth Doubling/halving time (days)
#> 1: -0.032 (-0.096 -- 0.034) -22 (20 -- -7.2)
#> 2: -0.028 (-0.11 -- 0.044) -25 (16 -- -6.1)
## plot
region_separate_rt$summary$plots$R
```

![plot of chunk regional_epinow_multiple](figure/regional_epinow_multiple-1.png)
113 changes: 113 additions & 0 deletions vignettes/epinow.Rmd.orig
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
---
title: "Using epinow() for running in production mode"
output:
rmarkdown::html_vignette:
toc: false
number_sections: false
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
vignette: >
%\VignetteIndexEntry{Using epinow() for running in production mode}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 6.5,
fig.height = 6.5
)
```

The _EpiNow2_ package contains functionality to run `estimate_infections()` in production mode, i.e. with full logging and saving all relevant outputs and plots to dedicated folders in the hard drive.
This is done with the `epinow()` function, that takes the same options as `estimate_infections()` with some additional options that determine, for example, where output gets stored and what output exactly.
The function can be a useful option when, e.g., running the model daily with updated data on a high-performance computing server to feed into a dashboard.
For more detail on the various model options available, see the [Examples](estimate_infections_options.html) vignette, for more on the general modelling approach the [Workflow](estimate_infections_workflow.html), and for theoretical background the [Model definitions](estimate_infections.html) vignette

# Running the model on a single region

To run the model in production mode for a single region, set the parameters up in the same way as for `estimate_infections()` (see the [Workflow](estimate_infections_workflow.html) vignette).
Here we use the example delay and generation time distributions that come with the package.
This should be replaced with parameters relevant to the system that is being studied.

```{r setup }
library("EpiNow2")
options(mc.cores = 4)
reported_cases <- example_confirmed[1:60]
incubation_period <- get_incubation_period(
disease = "SARS-CoV-2", source = "lauer"
)
reporting_delay <- dist_spec(
mean = convert_to_logmean(2, 1), mean_sd = 0,
sd = convert_to_logsd(2, 1), sd_sd = 0, max = 10
)
delay <- incubation_period + reporting_delay
generation_time <- get_generation_time(
disease = "SARS-CoV-2", source = "ganyani"
)
rt_prior <- list(mean = 2, sd = 0.1)
```

We can then run the `epinow()` function with the same arguments as `estimate_infections()`.

```{r epinow}
res <- epinow(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt_opts(prior = rt_prior),
target_folder = "results"
)
res$plots$R
```

The initial messages here indicate where log files can be found, and summarised results and plots are in the folder given by `target_folder` (here: `results/`).

# Running the model simultaneously on multiple regions

The package also contains functionality to conduct inference contemporaneously (if separately) in production mode on multiple time series, e.g. to run the model on multiple regions.
This is done with the `regional_epinow()` function.

Say, for example, we construct a dataset containing two regions, `testland` and `realland` (in this simple example both containing the same case data).

```{r construct_regional_cases}
cases <- example_confirmed[1:60]
cases <- data.table::rbindlist(list(
data.table::copy(cases)[, region := "testland"],
cases[, region := "realland"]
))
```

To then run this on multiple regions using the default options above, we could use

```{r regional_epinow, fig.width = 8}
region_rt <- regional_epinow(
reported_cases = cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt_opts(prior = rt_prior),
)
## summary
region_rt$summary$summarised_results$table
## plot
region_rt$summary$plots$R
```

If instead, we wanted to use the Gaussian Process for `testland` and a weekly random walk for `realland` we could specify these separately using the `opts_list()` and `update_list()` functions

```{r regional_epinow_multiple, fig.width = 8}
gp <- opts_list(gp_opts(), cases)
gp <- update_list(gp, list(realland = NULL))
rt <- opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
region_separate_rt <- regional_epinow(
reported_cases = cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(delay),
rt = rt, gp = gp,
)
## summary
region_separate_rt$summary$summarised_results$table
## plot
region_separate_rt$summary$plots$R
```
Loading

0 comments on commit 14718c0

Please sign in to comment.