JOSS review: sinusoid examples and plots in paper (#298)
* Paper and docs Fig. 1 comparing true and observed signals: changed 6.99 -> 7.0 in the legend; edited the blue vertical line with arrows marking the observation range so that it is now centered on the observed mean, following reviewer comments.

* Paper and docs contour plots showing how the sinusoid outputs vary with respect to the parameters: fixed the color limits across all contour plots for easier comparison.

* Docs for sampling with Random Features: added a table with the mean and std of parameter values to directly compare against the GP results.

* Paper: updated the sentence comparing ABC and CES to avoid making too strong a claim; instead, just comment on some limitations of ABC and CES. Also added a reference to the high-dimensional chapter of the ABC Handbook.
lm2612 authored Apr 4, 2024
1 parent fb4fa50 commit 2827e71
Showing 15 changed files with 55 additions and 9 deletions.
Binary file modified docs/src/assets/sinusoid_GP_emulator_contours.png
Binary file modified docs/src/assets/sinusoid_GP_errors_contours.png
Binary file modified docs/src/assets/sinusoid_RF_emulator_contours.png
Binary file modified docs/src/assets/sinusoid_RF_errors_contours.png
Binary file modified docs/src/assets/sinusoid_groundtruth_contours.png
Binary file modified docs/src/assets/sinusoid_true_vs_observed_signal.png
13 changes: 9 additions & 4 deletions docs/src/examples/sinusoid_example.md
@@ -402,10 +402,10 @@ println("Vertical shift mean: ", mean(constrained_posterior["vert_shift"]), ", s
This gives:


| parameters | mean | std |
| :---------------- | :------: | :----: |
| amplitude | 3.0382 | 0.2861 |
| vert_shift | 6.3897 | 0.4601 |
| parameters | mean | std |
| :---------------- | :------: | :----: |
| amplitude | 3.0382 | 0.2880 |
| vert_shift | 6.3774 | 0.4586 |

This is in agreement with the true $\theta=(3.0, 7.0)$ and with the observational covariance matrix we provided $\Gamma=0.2 * I$ (i.e., a standard deviation of approx. $0.45$). `CalibrateEmulateSample.jl` has built-in plotting
recipes to help us visualize the prior and posterior distributions. Note that these are the
@@ -433,6 +433,11 @@ $\theta_2$ below, with the marginal distributions on each axis.
We can repeat the sampling method using the random features emulator instead of the Gaussian
process and we find similar results:

| parameters | mean | std |
| :---------------- | :------: | :----: |
| amplitude | 3.3210 | 0.7216 |
| vert_shift | 6.3986 | 0.5098 |
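For reference, summary statistics like those in the tables above are just sample means and standard deviations over the MCMC chain, as in the docs' `println` calls. A minimal sketch, with a hypothetical `constrained_posterior` dictionary standing in for the example's posterior samples:

```julia
using Statistics

# Hypothetical stand-in for the constrained posterior samples produced by the
# MCMC step in the example; each entry is a vector of samples for one parameter.
constrained_posterior = Dict(
    "amplitude" => [3.0, 3.1, 2.9, 3.2, 3.0],
    "vert_shift" => [6.4, 6.3, 6.5, 6.4, 6.3],
)

# Report the per-parameter mean and standard deviation of the samples.
for (name, samples) in constrained_posterior
    println(name, ": mean = ", mean(samples), ", std = ", std(samples))
end
```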

![RF_2d_posterior](../assets/sinusoid_MCMC_hist_RF.png)

It is reassuring to see that our uncertainty quantification methods are robust to the different emulator
22 changes: 22 additions & 0 deletions examples/Sinusoid/emulate.jl
@@ -156,11 +156,15 @@ rf_std_grid = reshape(permutedims(reduce(vcat, [x' for x in rf_std]), (2, 1)), (
## Plot
# First, we will plot the ground truth. We have 2 parameters and 2 outputs, so we will create a contour plot
# for each output to show how they vary against the two inputs.
range_clims = (0, 8)
mean_clims = (-6, 10)

p1 = contour(
amp_range,
vshift_range,
g_true_grid[1, :, :];
fill = true,
clims = range_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "True Sinusoid Range",
@@ -170,6 +174,7 @@ p2 = contour(
vshift_range,
g_true_grid[2, :, :];
fill = true,
clims = mean_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "True Sinusoid Mean",
@@ -188,6 +193,7 @@ p1 = contour(
vshift_range,
gp_grid[1, :, :];
fill = true,
clims = range_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP Sinusoid Range",
@@ -200,6 +206,7 @@ p2 = contour(
vshift_range,
gp_grid[2, :, :];
fill = true,
clims = mean_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP Sinusoid Mean",
@@ -214,6 +221,7 @@ p1 = contour(
vshift_range,
rf_grid[1, :, :];
fill = true,
clims = range_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF Sinusoid Range",
@@ -224,6 +232,7 @@ p2 = contour(
vshift_range,
rf_grid[2, :, :];
fill = true,
clims = mean_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF Sinusoid Mean",
@@ -238,12 +247,16 @@ savefig(p, joinpath(data_save_directory, "sinusoid_RF_emulator_contours.png"))

# Next, we plot uncertainty estimates from the GP and RF emulators
# Plot GP std estimates
range_std_clims = (0, 2)
mean_std_clims = (0, 1)

p1 = contour(
amp_range,
vshift_range,
gp_std_grid[1, :, :];
c = :cividis,
fill = true,
clims = range_std_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP 1σ in Sinusoid Range",
@@ -254,6 +267,7 @@ p2 = contour(
gp_std_grid[2, :, :];
c = :cividis,
fill = true,
clims = mean_std_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP 1σ in Sinusoid Mean",
@@ -269,6 +283,7 @@ p1 = contour(
rf_std_grid[1, :, :];
c = :cividis,
fill = true,
clims = range_std_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF 1σ in Sinusoid Range",
@@ -279,6 +294,7 @@ p2 = contour(
rf_std_grid[2, :, :];
c = :cividis,
fill = true,
clims = mean_std_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF 1σ in Sinusoid Mean",
@@ -291,12 +307,15 @@ savefig(p, joinpath(data_save_directory, "sinusoid_RF_emulator_std_contours.png"
# Finally, we should validate how accurate the emulators are by looking at the absolute difference between emulator
# predictions and the ground truth.
gp_diff_grid = abs.(gp_grid - g_true_grid)
range_diff_clims = (0, 1)
mean_diff_clims = (0, 1)
p1 = contour(
amp_range,
vshift_range,
gp_diff_grid[1, :, :];
c = :cividis,
fill = true,
clims = range_diff_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP error in Sinusoid Range",
@@ -307,6 +326,7 @@ p2 = contour(
gp_diff_grid[2, :, :];
c = :cividis,
fill = true,
clims = mean_diff_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "GP error in Sinusoid Mean",
@@ -321,6 +341,7 @@ p1 = contour(
rf_diff_grid[1, :, :];
c = :cividis,
fill = true,
clims = range_diff_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF error in Sinusoid Range",
@@ -331,6 +352,7 @@ p2 = contour(
rf_diff_grid[2, :, :];
c = :cividis,
fill = true,
clims = mean_diff_clims,
xlabel = "Amplitude",
ylabel = "Vertical Shift",
title = "RF error in Sinusoid Mean",
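The recurring change in this file — adding `clims = ...` to each `contour` call — pins the color scale so that panels sharing a quantity are directly comparable. A minimal sketch of the pattern, assuming Plots.jl and made-up grids in place of the example's emulator output:

```julia
using Plots

amp_range = range(0.5, 5.0; length = 40)
vshift_range = range(0.0, 10.0; length = 40)

# Toy stand-ins for the true and emulated output grids (rows index vshift, columns amplitude).
g_true = [a + v for v in vshift_range, a in amp_range]
g_emul = g_true .+ 0.3

# Shared color limits, as set in this commit, so both panels use one scale.
mean_clims = (-6, 10)

p1 = contour(amp_range, vshift_range, g_true; fill = true, clims = mean_clims,
             xlabel = "Amplitude", ylabel = "Vertical Shift", title = "True Sinusoid Mean")
p2 = contour(amp_range, vshift_range, g_emul; fill = true, clims = mean_clims,
             xlabel = "Amplitude", ylabel = "Vertical Shift", title = "Emulated Sinusoid Mean")
plot(p1, p2)
```

Without a shared `clims`, each panel is normalized to its own data range, and small emulator errors can look like large qualitative differences.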
8 changes: 4 additions & 4 deletions examples/Sinusoid/sinusoid_setup.jl
@@ -88,15 +88,15 @@ function generate_obs(amplitude_true, vert_shift_true, Γ; rng = Random.GLOBAL_R
color = :red,
style = :dash,
linewidth = 2,
label = "True mean: " * string(round(y2_true, digits = 2)),
label = "True mean: " * string(round(y2_true, digits = 1)),
)
plot!(
[argmax(signal_true), argmax(signal_true)],
[minimum(signal_true), maximum(signal_true)],
arrows = :both,
color = :red,
linewidth = 2,
label = "True range: " * string(round(y1_true, digits = 2)),
label = "True range: " * string(round(y1_true, digits = 1)),
)

# However, our observations are typically not noise-free, so we add some white noise to our
@@ -115,8 +115,8 @@ function generate_obs(amplitude_true, vert_shift_true, Γ; rng = Random.GLOBAL_R
label = "Observed mean: " * string(round(y2_obs, digits = 2)),
)
plot!(
[argmax(signal_true) + 10, argmax(signal_true) + 10],
[y2_true - y1_obs / 2, y2_true + y1_obs / 2],
[argmax(signal_true) + 15, argmax(signal_true) + 15],
[y2_obs - y1_obs / 2, y2_obs + y1_obs / 2],
arrows = :both,
color = :blue,
linewidth = 1,
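The last hunk above centers the observed-range arrow on the observed mean (`y2_obs`) rather than the true mean, and moves it slightly further from the true-range arrow. A sketch of the annotation pattern, assuming Plots.jl and made-up values in place of the example's observed statistics:

```julia
using Plots

x = 1:100
signal_true = 7.0 .+ 3.0 .* sin.(2π .* x ./ 50)
y1_obs, y2_obs = 6.2, 7.1  # made-up observed range and observed mean

p = plot(x, signal_true; label = "True signal")
# Vertical double-headed arrow spanning the observed range, centered on the
# observed mean, offset 15 samples right of the signal's maximum (as in the commit).
plot!([argmax(signal_true) + 15, argmax(signal_true) + 15],
      [y2_obs - y1_obs / 2, y2_obs + y1_obs / 2];
      arrows = :both, color = :blue, linewidth = 1, label = "Observed range")
```

Centering on `y2_obs` keeps the arrow consistent with the quantity it annotates, which is what the reviewer comment requested.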
19 changes: 19 additions & 0 deletions paper.bib
@@ -12,6 +12,25 @@ @article{julia
journal = {{SIAM} Review}
}

@book{Sisson:2018,
title = {Handbook of {Approximate} {Bayesian} {Computation}},
isbn = {978-1-351-64346-7},
publisher = {CRC Press},
author = {Sisson, Scott A. and Fan, Yanan and Beaumont, Mark},
year = {2018}
}

@incollection{Nott:2018,
author = {Nott, David J. and Ong, Victor M.-H. and Fan, Y. and Sisson, S. A.},
title = {High-{Dimensional} {ABC}},
booktitle = {Handbook of {Approximate} {Bayesian} {Computation}},
isbn = {978-1-315-11719-5},
chapter = {8},
pages = {211-241},
year = {2018},
publisher = {CRC Press},
}

@article{Cleary:2021,
title = {Calibrate, emulate, sample},
journal = {Journal of Computational Physics},
2 changes: 1 addition & 1 deletion paper.md
@@ -67,7 +67,7 @@ Computationally expensive computer codes for predictive modelling are ubiquitous

In Julia there are a few tools for performing non-accelerated uncertainty quantification, from classical sensitivity analysis approaches, e.g., [UncertaintyQuantification.jl](https://zenodo.org/records/10149017), GlobalSensitivity.jl [@Dixit:2022], and MCMC, e.g., [Mamba.jl](https://github.com/brian-j-smith/Mamba.jl) or [Turing.jl](https://turinglang.org/). For computational efficiency, ensemble methods also provide approximate sampling (e.g., the Ensemble Kalman Sampler [@Garbuno-Inigo:2020b;@Dunbar:2022a]) though these only provide Gaussian approximations of the posterior.

Accelerated uncertainty quantification tools also exist for the related approach of Approximate Bayesian Computation (ABC), e.g., GpABC [@Tankhilevich:2020] or [ApproxBayes.jl](https://github.com/marcjwilliams1/ApproxBayes.jl?tab=readme-ov-file); these tools both approximately sample from the posterior distribution. In ABC, this approximation comes from bypassing the likelihood that is usually required in sampling methods, such as MCMC. Instead, the goal of ABC is to replace the likelihood with a scalar-valued sampling objective that compares model and data. In CES, the approximation comes from learning the parameter-to-data map, then following this it calculates an explicit likelihood and uses exact sampling via MCMC. Some ABC algorithms also make use of statistical emulators to further accelerate sampling (GpABC). ABC can be used in more general contexts than CES, but suffers greater approximation error and more stringent assumptions, especially in multi-dimensional problems.
Accelerated uncertainty quantification tools also exist for the related approach of Approximate Bayesian Computation (ABC), e.g., GpABC [@Tankhilevich:2020] or [ApproxBayes.jl](https://github.com/marcjwilliams1/ApproxBayes.jl?tab=readme-ov-file); these tools both approximately sample from the posterior distribution. In ABC, this approximation comes from bypassing the likelihood that is usually required in sampling methods, such as MCMC. Instead, the goal of ABC is to replace the likelihood with a scalar-valued sampling objective that compares model and data. In CES, the approximation comes from learning the parameter-to-data map, then using this map to compute an explicit likelihood and sample exactly via MCMC. Some ABC algorithms also make use of statistical emulators to further accelerate sampling (GpABC). Although flexible, ABC encounters challenges due to the subjectivity of summary statistics and distance metrics, which may lead to approximation errors, particularly in high-dimensional settings [@Nott:2018]. CES is more restrictive due to its use of an explicit Gaussian likelihood, but also leverages this structure to deal with high-dimensional data.

Several other tools are available in other languages for the purpose of accelerated learning of the posterior distribution or posterior sampling. Two such examples, written in python, approximate the log-posterior distribution directly with a Gaussian process: [PyVBMC](https://github.com/acerbilab/pyvbmc) [@Huggins:2023] additionally uses variational approximations to calculate the normalization constant, while [GPry](https://github.com/jonaselgammal/GPry) [@Gammal:2022] iteratively trains the GP with an active training point selection algorithm. Such algorithms are distinct from CES, which approximates the parameter-to-data map with the Gaussian process, and advocates ensemble Kalman methods to select training points.

Expand Down
Binary file modified sinusoid_GP_emulator_contours.png
Binary file modified sinusoid_MCMC_hist_GP.png
Binary file modified sinusoid_eki_pairs.png
Binary file modified sinusoid_true_vs_observed_signal.png
