# Random Feature Emulator

!!! note "Have a go with Gaussian processes first"
    We recommend that users first try `GaussianProcess` for their problems. As random features are a more recent tool, the training procedures and interfaces are still experimental and in development.

Random features provide a flexible framework to approximate a Gaussian process. Using random sampling of features, the method forms a low-rank approximation, leading to advantageous scaling properties (with the number of training points, input dimensions, and output dimensions). In the infinite-sample limit, there are often (known) explicit Gaussian process kernels to which the random feature representation converges.
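For intuition, here is a minimal sketch (not this package's implementation) of the classical random Fourier feature construction, where cosine features with Gaussian-sampled frequencies converge to the RBF kernel as the number of features grows:

```julia
using LinearAlgebra, Random

# approximate the RBF kernel exp(-norm(x - y)^2 / (2 * lengthscale^2))
# with randomly sampled cosine features (Rahimi & Recht, 2007)
function rff_kernel(x, y, n_features; lengthscale = 1.0, rng = Random.default_rng())
    d = length(x)
    Ω = randn(rng, n_features, d) ./ lengthscale # random frequencies
    b = 2π .* rand(rng, n_features)              # random phases
    φ(z) = sqrt(2 / n_features) .* cos.(Ω * z .+ b)
    return dot(φ(x), φ(y))
end

x, y = randn(3), randn(3)
rff_kernel(x, y, 100_000) # ≈ exp(-norm(x - y)^2 / 2)
```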

We provide two types of `MachineLearningTool` for random feature emulation, the `ScalarRandomFeatureInterface` and the `VectorRandomFeatureInterface`.

The `ScalarRandomFeatureInterface` closely mimics the role of a `GaussianProcess` package, by training a scalar-output function distribution. It can be applied to multidimensional output problems (as with `GaussianProcess`) by relying on data processing tools, such as the decorrelation performed when the `decorrelate = true` keyword argument is provided to the `Emulator`.

The `VectorRandomFeatureInterface`, when applied to multidimensional problems, directly trains a function distribution between multi-dimensional spaces. This approach does not require the data processing used by the scalar method (though it can still be helpful). The resulting emulator can be cheaper to evaluate, but on the other hand its training can be more challenging and computationally expensive.
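
As a sketch of how the two interfaces are constructed (the dimensions and feature count here are illustrative choices, not defaults):

```julia
using CalibrateEmulateSample.Emulators

input_dim, output_dim = 5, 3
n_features = 200 # number of random features (illustrative)

# scalar interface: one scalar-output function distribution,
# applied output-by-output after decorrelation in the `Emulator`
srfi = ScalarRandomFeatureInterface(n_features, input_dim)

# vector interface: one function distribution mapping directly
# between the multidimensional input and output spaces
vrfi = VectorRandomFeatureInterface(n_features, input_dim, output_dim)
```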

## The `kernel_structure` keyword - for flexibility

The default kernel structures are:

```julia
scalar_default_kernel = SeparableKernel(LowRankFactor(Int(ceil(sqrt(input_dim)))), OneDimFactor())
vector_default_kernel = SeparableKernel(LowRankFactor(Int(ceil(sqrt(output_dim)))), LowRankFactor(Int(ceil(sqrt(output_dim)))))
```
!!! note "Relating covariance structure and training"
The parallels between random feature and gaussian process also extends to the hyperparameter learning. For example,
The parallels between random feature and Gaussian process also extends to the hyperparameter learning. For example,
- A `ScalarRandomFeatureInterface` with a `DiagonalFactor` input covariance structure approximates a Gaussian process with automatic relevance determination (ARD) kernel, where one learns a lengthscale in each dimension of the input space
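
A minimal sketch of building such an ARD-like kernel (assuming the `"diagonal"` string maps to `DiagonalFactor`; the dimension is illustrative):

```julia
using CalibrateEmulateSample.Emulators

input_dim = 5
# diagonal input covariance: one learnable scale per input dimension (ARD-like)
ard_like_kernel = SeparableKernel(
    cov_structure_from_string("diagonal", input_dim),
    cov_structure_from_string("onedim", 1),
)
calculate_n_hyperparameters(input_dim, ard_like_kernel)
build_default_prior(input_dim, ard_like_kernel)
```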

## The `optimizer_options` keyword - for performance
We suggest looking at the [`EnsembleKalmanProcesses`](https://github.com/CliMA/EnsembleKalmanProcesses.jl) documentation for more details on the optimizer configuration. In particular:
- If `n_e` becomes less than the number of hyperparameters, the updates will fail, and a localizer must be specified in `loc`.
- If the algorithm terminates at `T=1` and the resulting emulator looks unacceptable, one can change or add arguments in `sch`, e.g. `DataMisfitController(on_terminate = "continue")` (see the sketch below).
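
A sketch of overriding the scheduler in this way (the `"scheduler"` key name and the feature/dimension values are assumptions used to illustrate the pattern):

```julia
using CalibrateEmulateSample.Emulators
using EnsembleKalmanProcesses # provides DataMisfitController

# continue ensemble Kalman updates past the usual T = 1 termination time
optimizer_options = Dict(
    "scheduler" => DataMisfitController(on_terminate = "continue"),
)
srfi = ScalarRandomFeatureInterface(200, 5; optimizer_options = optimizer_options)
```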

!!! note
    Widely robust defaults here are a work in progress.

## Key methods

To interact with the kernel/covariance structures we have standard `get_*` methods, along with the following:
- `calculate_n_hyperparameters(in_dim, out_dim, kernel_structure)` calculates the number of hyperparameters created by using the given kernel structure (it can also be applied to each covariance structure individually)
- `build_default_prior(in_dim, out_dim, kernel_structure)` creates a `ParameterDistribution` for the hyperparameters based on the kernel structure. This serves as the initialization of the training procedure.

## Example families and their hyperparameters

### Scalar: ``\mathbb{R}^5 \to \mathbb{R}`` at defaults
```julia
using CalibrateEmulateSample.Emulators
input_dim = 5
# build the default scalar kernel directly (here it will be a rank-3 perturbation from the identity)
scalar_default_kernel = SeparableKernel(
    cov_structure_from_string("lowrank", input_dim),
    cov_structure_from_string("onedim", 1),
)

calculate_n_hyperparameters(input_dim, scalar_default_kernel)
build_default_prior(input_dim, scalar_default_kernel)
# 15-dim unbounded distribution `input_lowrank_U`
# 1-dim positive distribution `sigma`
```
### Vector, separable: ``\mathbb{R}^{25} \to \mathbb{R}^{50}`` at defaults
Or take a diagonalized 8-dimensional input, and assume a full 6-dimensional output:

```julia
build_default_prior(input_dim, output_dim, vector_default_kernel)
# 1-dim positive distribution `sigma`
```
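
To train with one of these kernels, it can be passed to an interface at construction; a sketch (the feature count is illustrative):

```julia
# pass the kernel into an interface via the `kernel_structure` keyword
vrfi = VectorRandomFeatureInterface(
    300,
    input_dim,
    output_dim;
    kernel_structure = vector_default_kernel,
)
```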

### Vector, nonseparable: ``\mathbb{R}^{25} \to \mathbb{R}^{50}``
The following represents the most general kernel case.

!!! note "Use low-rank/diagonls representations where possible"
The following is far too general, leading to large numbers of hyperparameters
```julia
using CalibrateEmulateSample.Emulators
input_dim = 25
output_dim = 50
# a sketch of the most general construction, assuming a full Cholesky-factorized
# covariance over the (input_dim * output_dim)-dimensional product space
vector_general_kernel = NonseparableKernel(cov_structure_from_string("cholesky", input_dim * output_dim))

calculate_n_hyperparameters(input_dim, output_dim, vector_general_kernel)
# the count grows like (input_dim * output_dim)^2 / 2 - far too many to train
```
