diff --git a/.github/workflows/pkgdown-netlify-preview.yaml b/.github/workflows/pkgdown-netlify-preview.yaml index 2c51f01d..ffbb3e77 100644 --- a/.github/workflows/pkgdown-netlify-preview.yaml +++ b/.github/workflows/pkgdown-netlify-preview.yaml @@ -54,12 +54,20 @@ jobs: branch: gh-pages folder: docs + - id: deploy-dir + name: Determine dev status + run: | + if [[ $(grep -c -E 'sion. ([0-9]*\.){3}' ${{ github.workspace }}/DESCRIPTION) == 1 ]]; then + echo 'dir=./docs/dev' >> $GITHUB_OUTPUT + else + echo 'dir=./docs' >> $GITHUB_OUTPUT + fi - name: Deploy PR preview to Netlify if: contains(env.isPush, 'false') id: netlify-deploy - uses: nwtgck/actions-netlify@v2 + uses: nwtgck/actions-netlify@v3 with: - publish-dir: './docs' + publish-dir: '${{ steps.deploy-dir.outputs.dir }}' production-branch: main github-token: ${{ secrets.GITHUB_TOKEN }} deploy-message: diff --git a/DESCRIPTION b/DESCRIPTION index 05f1b2b7..37e48809 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -55,9 +55,11 @@ Imports: yaml Suggests: covr, + DT, gert, kableExtra, mockery, + pak, readr, rmarkdown, testthat (>= 3.2.0), diff --git a/NEWS.md b/NEWS.md index 641b4ef4..414715c9 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,9 @@ # hubValidations (development version) +* Added: + - new vignette on how to create custom validation checks for hub validations (#121) + - new section on how to manage additional dependencies required by custom validation functions (#22). + # hubValidations 0.7.0 * Added function `create_custom_check()` for creating custom validation check function files from templates (#121). diff --git a/R/check_tbl_unique_round_id.R b/R/check_tbl_unique_round_id.R index 5426fcfc..33795037 100644 --- a/R/check_tbl_unique_round_id.R +++ b/R/check_tbl_unique_round_id.R @@ -2,7 +2,7 @@ #' #' @param round_id_col Character string. The name of the column containing #' `round_id`s. Usually, the value of round property `round_id` in hub `tasks.json` -#' config file. +#' config file. 
Defaults to `NULL` and determined from the config if applicable. #' @inheritParams check_tbl_colnames #' @return #' Depending on whether validation has succeeded, one of: diff --git a/_pkgdown.yml b/_pkgdown.yml index a521be12..f6ea9914 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -67,8 +67,11 @@ navbar: - text: Validating submissions locally href: articles/validate-submission.html - text: ------- - - text: Including custom validation functions - href: articles/custom-functions.html + - text: "Custom validation checks" + - text: Writing custom validation functions + href: articles/writing-custom-fns.html + - text: Deploying custom validation functions + href: articles/deploying-custom-functions.html development: mode: auto diff --git a/man/check_tbl_match_round_id.Rd b/man/check_tbl_match_round_id.Rd index c29eb7a1..46d65606 100644 --- a/man/check_tbl_match_round_id.Rd +++ b/man/check_tbl_match_round_id.Rd @@ -24,7 +24,7 @@ files within the \code{hub-config} directory.} \item{round_id_col}{Character string. The name of the column containing \code{round_id}s. Usually, the value of round property \code{round_id} in hub \code{tasks.json} -config file.} +config file. Defaults to \code{NULL} and determined from the config if applicable.} } \value{ Depending on whether validation has succeeded, one of: diff --git a/man/check_tbl_unique_round_id.Rd b/man/check_tbl_unique_round_id.Rd index a1b26f72..2dd36d19 100644 --- a/man/check_tbl_unique_round_id.Rd +++ b/man/check_tbl_unique_round_id.Rd @@ -24,7 +24,7 @@ files within the \code{hub-config} directory.} \item{round_id_col}{Character string. The name of the column containing \code{round_id}s. Usually, the value of round property \code{round_id} in hub \code{tasks.json} -config file.} +config file. 
Defaults to \code{NULL} and determined from the config if applicable.} } \value{ Depending on whether validation has succeeded, one of: diff --git a/man/check_valid_round_id_col.Rd b/man/check_valid_round_id_col.Rd index 7510ef75..e4693c7c 100644 --- a/man/check_valid_round_id_col.Rd +++ b/man/check_valid_round_id_col.Rd @@ -25,7 +25,7 @@ files within the \code{hub-config} directory.} \item{round_id_col}{Character string. The name of the column containing \code{round_id}s. Usually, the value of round property \code{round_id} in hub \code{tasks.json} -config file.} +config file. Defaults to \code{NULL} and determined from the config if applicable.} } \value{ Depending on whether validation has succeeded, one of: diff --git a/man/validate_model_data.Rd b/man/validate_model_data.Rd index 4ffa95e1..8e98e376 100644 --- a/man/validate_model_data.Rd +++ b/man/validate_model_data.Rd @@ -30,7 +30,7 @@ the hub's model-output directory.} \item{round_id_col}{Character string. The name of the column containing \code{round_id}s. Usually, the value of round property \code{round_id} in hub \code{tasks.json} -config file.} +config file. Defaults to \code{NULL} and determined from the config if applicable.} \item{output_type_id_datatype}{character string. One of \code{"from_config"}, \code{"auto"}, \code{"character"}, \code{"double"}, \code{"integer"}, \code{"logical"}, \code{"Date"}. diff --git a/man/validate_submission.Rd b/man/validate_submission.Rd index cbbee86a..fe6cae0d 100644 --- a/man/validate_submission.Rd +++ b/man/validate_submission.Rd @@ -33,7 +33,7 @@ the hub's model-output directory.} \item{round_id_col}{Character string. The name of the column containing \code{round_id}s. Usually, the value of round property \code{round_id} in hub \code{tasks.json} -config file.} +config file. Defaults to \code{NULL} and determined from the config if applicable.} \item{validations_cfg_path}{Path to \code{validations.yml} file. 
If \code{NULL} defaults to \code{hub-config/validations.yml}.} diff --git a/tests/testthat/_snaps/check_tbl_values_required.md b/tests/testthat/_snaps/check_tbl_values_required.md index 3f0d2be4..4a3aa314 100644 --- a/tests/testthat/_snaps/check_tbl_values_required.md +++ b/tests/testthat/_snaps/check_tbl_values_required.md @@ -224,7 +224,8 @@ --- Code - check_for_errors(validate_submission(hub_path, file_path)) + check_for_errors(validate_submission(hub_path, file_path, + skip_submit_window_check = TRUE)) Message -- 2024-10-02-UMass-HMLR.parquet ---- diff --git a/tests/testthat/test-check_tbl_values_required.R b/tests/testthat/test-check_tbl_values_required.R index 0aba4c07..48aecad0 100644 --- a/tests/testthat/test-check_tbl_values_required.R +++ b/tests/testthat/test-check_tbl_values_required.R @@ -277,7 +277,10 @@ test_that("(#123) check_tbl_values_required works with all optional output types ) # Ensure that req_vals check is the only one that fails expect_snapshot( - check_for_errors(validate_submission(hub_path, file_path)), + check_for_errors(validate_submission( + hub_path, file_path, + skip_submit_window_check = TRUE + )), error = TRUE ) }) diff --git a/vignettes/articles/children/_add-deps-pkg.Rmd b/vignettes/articles/children/_add-deps-pkg.Rmd new file mode 100644 index 00000000..9985a674 --- /dev/null +++ b/vignettes/articles/children/_add-deps-pkg.Rmd @@ -0,0 +1,24 @@ +### Deploying custom functions as a package + +To deploy custom functions managed as a package in `src/validations`, you can use the `pkg` configuration property in the `validations.yml` file to specify the package namespace. 
+ +For example, if you have created a simple package in `src/validations/` with a `cstm_check_tbl_example.R` script containing the specification of a `cstm_check_tbl_example()` function in `src/validations/R`, you can use the following configuration in your `validations.yml` file to source the function from the installed `validations` package namespace: + +``` +default: + validate_model_data: + custom_check: + fn: "cstm_check_tbl_example" + pkg: "validations" +``` + +To ensure the package (and any additional packages it depends on) is installed and available during validation, you must add the package to the `setup-r-dependencies` step in the `hubverse-actions` `validate-submission.yaml` GitHub Action workflow of your hub like so: + +```yaml + - uses: r-lib/actions/setup-r-dependencies@v2 + with: + packages: | + any::hubValidations + any::sessioninfo + local::./src/validations +``` diff --git a/vignettes/articles/children/_add-deps-source.Rmd b/vignettes/articles/children/_add-deps-source.Rmd new file mode 100644 index 00000000..c24029f8 --- /dev/null +++ b/vignettes/articles/children/_add-deps-source.Rmd @@ -0,0 +1,53 @@ + +## Available dependencies + +**All `hubValidations` exported functions are available** for use in your custom check functions as well as functions from hubverse packages **`hubUtils`**, **`hubAdmin`** and **`hubData`**. + +```{r, echo=FALSE} +get_deps <- function(pkg) { + suppressMessages(pak::pkg_deps(pkg)) +} +memoise_pkg_deps <- memoise::memoise(get_deps) +pkgs <- memoise_pkg_deps(".")[, c("package", "version")] +``` + +In addition, **functions in packages from the `hubValidations` dependency tree are also generally available**, both locally (once `hubValidations` is installed) and in the hubverse `validate-submission` GitHub Action. + +Functions from these packages can be used in your custom checks without specifying them as additional dependencies. 
+ +```{r, echo=FALSE} +pkgs[order(pkgs$package), ] |> + DT::datatable(rownames = FALSE) +``` + + +## Additional dependencies + +If any custom functions you are deploying depend on additional packages, you will need to ensure these packages are available during validation. + +The simplest way to ensure they are available is to edit the `setup-r-dependencies` step in the `hubverse-actions` [`validate-submission.yaml`](https://github.com/hubverse-org/hubverse-actions/blob/main/validate-submission/validate-submission.yaml) GitHub Action workflow of your hub and add any additional dependency to the `packages` field list. + +In the following pseudo example we add `additionalPackage` package to the list of standard dependencies: + +```yaml + - uses: r-lib/actions/setup-r-dependencies@v2 + with: + packages: | + any::hubValidations + any::sessioninfo + any::additionalPackage +``` + +Note that this ensures the additional dependency is available during validation on GitHub but does not guarantee it will be installed locally for hub administrators or submitting teams. Indeed such missing dependencies could lead to execution errors in custom checks when running `validate_submission()` locally. + +You could use documentation, like your hub's README to communicate additional required dependencies for validation to submitting teams. Even better, you could add a check to the top of your function to catch missing dependencies and provide a helpful error message to the user. + +```{r, eval=FALSE} +if (!(requireNamespace("additionalPackage", quietly = TRUE))) { + stop( + "Package 'additionalPackage' must be installed to run the full validation check. + Please install and try again." 
+ ) +} +``` + diff --git a/vignettes/articles/children/_custom-fn-available-args.Rmd b/vignettes/articles/children/_custom-fn-available-args.Rmd new file mode 100644 index 00000000..e3522eb8 --- /dev/null +++ b/vignettes/articles/children/_custom-fn-available-args.Rmd @@ -0,0 +1,20 @@ +Each of the `validate_*()` functions **contain a number of standard objects in their call environment** which are **available for downstream check functions to use as arguments** and are **passed automatically to arguments** of optional/custom functions **with the same name**. Therefore, values for such arguments do not need including in function deployment configuration but [**can be overridden through a function's `args` configuration**](deploying-custom-functions.html#deploying-optional-hubvalidations-functions) in `validations.yml` during deployment. + +**All `validate_*()` functions will contain the following five objects in their caller environment:** + + - **`file_path`**: character string of path to file being validated relative to the `model-output` directory. + - **`hub_path`**: character string of path to hub. + - **`round_id`**: character string of `round_id` derived from the model file name. + - **`file_meta`**: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details. + - **`validations_cfg_path`**: character string of path to `validations.yml` file. Defaults to `hub-config/validations.yml`. + +**`validate_model_data()` will contain the following additional objects:** + + - **`tbl`**: a tibble of the model output data being validated. + - **`tbl_chr`**: a tibble of the model output data being validated with all columns coerced to character type. + - **`round_id_col`**: character string of name of `tbl` column containing `round_id` information. Defaults to `NULL` and usually determined from the `tasks.json` config if applicable unless explicitly provided as an argument to `validate_model_data()`. 
+ - **`output_type_id_datatype`**: character string. The value of the `output_type_id_datatype` argument. This value is useful in functions like `hubData::create_hub_schema()` or `hubValidations::expand_model_out_grid()` to set the data type of `output_type_id` column. + - **`derived_task_ids`**: character vector or `NULL`. The value of the `derived_task_ids` argument, i.e. the names of task IDs whose values depend on other task IDs. + + +The `args` configuration can be used to override objects from the caller environment as well as defaults during deployment. diff --git a/vignettes/articles/custom-functions.Rmd b/vignettes/articles/deploying-custom-functions.Rmd similarity index 71% rename from vignettes/articles/custom-functions.Rmd rename to vignettes/articles/deploying-custom-functions.Rmd index 351ed9bd..e4aec068 100644 --- a/vignettes/articles/custom-functions.Rmd +++ b/vignettes/articles/deploying-custom-functions.Rmd @@ -1,5 +1,5 @@ --- -title: "Include custom validation functions" +title: "Deploying custom validation functions" --- ```{r, include = FALSE} @@ -17,7 +17,7 @@ library(hubValidations) Custom validation functions can be included and configured within standard `hubValidation` workflows by **including a `validations.yml` file in the `hub-config` directory**. Alternatively, an appropriately structured file can be included at a different location and the path to the file provided through argument `validations_cfg_path`. `hubValidations` uses the [`config`](https://rstudio.github.io/config/articles/inheritance.html) package to get validation configuration. This allows for configuration inheritance and the ability to include executable R code. -See the `confog` package vignette on [inheritance and R expressions](https://rstudio.github.io/config/articles/inheritance.html) for more details. +See the `config` package vignette on [inheritance and R expressions](https://rstudio.github.io/config/articles/inheritance.html) for more details. 
## `validations.yml` structure @@ -34,37 +34,24 @@ Within the default configuration, individual checks can be configured for each o - **`fn`:** The name of the check function to be run, as character string (required). - **`pkg`:** The name of the package namespace from which to get check function. Must be supplied if function is distributed as part of a package. - **`source:`** Path to `.R` script containing function code to be sourced. If relative, should be relative to the hub's directory root. Must be supplied if function is not part of a package and only exists as a script. - - **`args`:** A yaml dictionary of key/value pairs or arguments to be passed to the custom function. Values can be yaml lists or even executable R code (optional). + - **`args`:** A yaml dictionary of key/value pairs of arguments and their values to be passed to the custom function. Values can be yaml lists or even executable R code (optional). -Note that each of the `validate_*()` functions contain a standard objects in their call environment which are passed automatically to any custom check function and therefore do not need including in the `args` configuration. - -- **`validate_model_file`:** - - `file_path`: character string of path to file being validated relative to the `model-output` directory. - - `hub_path`: character string of path to hub. - - `round_id`: character string of `round_id` - - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details. -- **`validate_model_data`:** - - `tbl`: a tibble of the model output data being validated. - - `file_path`: character string of path to file being validated relative to the `model-output` directory. - - `hub_path`: character string of path to hub. - - `round_id`: character string of `round_id` - - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details. - - `round_id_col`: character string of name of `tbl` column containing `round_id` information. 
-- **`validate_model_metadata`:** - - `file_path`: character string of path to file being validated relative to the `model-output` directory. - - `hub_path`: character string of path to hub. - - `round_id`: character string of `round_id` - - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details. - -The `args` configuration can be used to override objects from the caller environment as well as defaults. - - -Here's an example configuration for a single check (`opt_check_tbl_horizon_timediff()`) to be run as part of the `validate_model_data()` validation function which checks the content of the model data submission files. + +```{r child="children/_custom-fn-available-args.Rmd", echo=FALSE, results="asis"} +``` + +#### Deploying optional `hubValidations` functions + +Here's an example configuration for a single optional `hubValidations` check, `opt_check_tbl_horizon_timediff()`, which checks that the temporal difference between the values in two date columns (defined by additional arguments `t0_colname` & `t1_colname`) is equal to a time period defined by horizon values (contained in a column defined by `horizon_colname`) and the length of a single horizon defined by argument `timediff`. + +The check is to be run as part of the `validate_model_data()` validation function which checks the content of the model data submission files. ```{r, eval=FALSE, code=readLines(system.file('testhubs/flusight/hub-config/validations.yml', package = 'hubValidations'))} ``` -The above configuration file relies on default values for arguments `horizon_colname` (`"horizon"`) and `timediff` (`lubridate::weeks()`). We can use the `validations.yml` `args` list to override the default values. Here's an example that includes **executable r code** as the value of an argument. +The above configuration file relies on default values for arguments `horizon_colname` (`"horizon"`) and `timediff` (`lubridate::weeks()`). 
We can **use the `validations.yml` `args` list to override the `horizon_colname` and `timediff` argument default values**. + +In this example, we **also include executable R code** as the value of the `timediff` argument. ``` default: @@ -79,6 +66,19 @@ default: timediff: !expr lubridate::weeks(2) ``` +#### Deploying custom functions + +The above example involved an optional `hubValidations` function. To deploy a custom function that is not part of `hubValidations` or any other package, you should store the script containing the function in the `src/validations/R/` directory (relative to the root of your hub) and include the path to the script in the `source` property in the configuration file. + +``` +default: + validate_model_data: + custom_check: + fn: "cstm_check_tbl_example" + source: "src/validations/R/cstm_check_tbl_example.R" +``` + + ### Round specific configuration Additional round specific configurations can be included in `validations.yml` that can add to or override default configurations. @@ -159,6 +159,12 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations") ``` -## Managing dependencies of custom sourced functions +# Managing dependencies of custom functions -TODO +If any custom functions you are deploying depend on additional packages, you will need to ensure these packages are available during validation. + +```{r child="children/_add-deps-source.Rmd", echo=FALSE, results="asis"} +``` + +```{r child="children/_add-deps-pkg.Rmd", echo=FALSE, results="asis"} +``` diff --git a/vignettes/articles/validate-pr.Rmd b/vignettes/articles/validate-pr.Rmd index 896ac158..01a6755f 100644 --- a/vignettes/articles/validate-pr.Rmd +++ b/vignettes/articles/validate-pr.Rmd @@ -17,7 +17,7 @@ library(hubValidations) The `validate_pr()` functions is designed to be used to validate team submissions through Pull Requests on GitHub. 
Only model output and model metadata files are individually validated using `validate_submission()` or `validate_model_metadata()` respectively on each file according to file type -(_See the end of this article for details of the standard checks performed on each file. For more information on deploying optional or custom functions please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`)_). +(_See the end of this article for details of the standard checks performed on each file. For more information on deploying optional or custom functions please check the article on [deploying custom functions](deploying-custom-functions.html) (`vignette("articles/deploying-custom-functions")`)_). As part of checks, however, hub config files are also validated. Any other files included in the PR are ignored but flagged in a message. @@ -76,7 +76,7 @@ Supplying the names of derived task IDs to argument `derived_task_ids` will igno #### Warning -Ignoring derived task IDs means that the validity of derived task ID value combinations will not be check. It is therefore **important to ensure that the values of derived task IDs are correctly derived from other task IDs through custom checks**. For example, the values of `target_end_date` can be checked by deploying optional check `opt_check_tbl_horizon_timediff()`. See the article on [including custom functions](articles/custom-functions.html) for more information. +Ignoring derived task IDs means that the validity of derived task ID value combinations will not be checked. It is therefore **important to ensure that the values of derived task IDs are correctly derived from other task IDs through custom checks**. For example, the values of `target_end_date` can be checked by deploying optional check `opt_check_tbl_horizon_timediff()`. See the article on [deploying custom functions](deploying-custom-functions.html) for more information. 
@@ -215,7 +215,7 @@ check_for_errors(v_pass, verbose = TRUE) ## `validate_pr` check details -For details on the structure of `` objects, including on how to access more information about specific checks, see `vignette("hub-validations-class")`. +For details on the structure of `` objects, including on how to access more information about specific checks, see `vignette("articles/hub-validations-class")`. ### Checks on model output files @@ -260,6 +260,6 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations") #### Custom checks -The standard checks discussed here are the checks deployed by default by the `validate_pr` function. For more information on deploying optional or custom functions please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`). +The standard checks discussed here are the checks deployed by default by the `validate_pr` function. For more information on deploying optional or custom functions please check the article on [deploying custom functions](deploying-custom-functions.html) (`vignette("articles/deploying-custom-functions")`). diff --git a/vignettes/articles/validate-submission.Rmd b/vignettes/articles/validate-submission.Rmd index f8717b3a..281b4f9c 100644 --- a/vignettes/articles/validate-submission.Rmd +++ b/vignettes/articles/validate-submission.Rmd @@ -29,7 +29,7 @@ validate_submission(hub_path, ) ``` -For more details on the structure of `` objects, including how to access more information on individual checks, see `vignette("hub-validations-class")`. +For more details on the structure of `` objects, including how to access more information on individual checks, see `vignette("articles/hub-validations-class")`. ### Validation early return @@ -136,7 +136,7 @@ validate_model_metadata(hub_path, ``` -For more details on the structure of `` objects, including how to access more information on individual checks, see `vignette("hub-validations-class")`. 
+For more details on the structure of `` objects, including how to access more information on individual checks, see `vignette("articles/hub-validations-class")`. ### `validate_model_metadata` check details @@ -163,6 +163,6 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations") #### Custom checks The standard checks discussed here are the checks deployed by default by the `validate_submission` or `validate_model_metadata` functions. -For more information on deploying optional/custom functions or functions that require configuration please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`). +For more information on deploying optional/custom functions or functions that require configuration please check the article on [including custom functions](deploying-custom-functions.html) (`vignette("articles/deploying-custom-functions")`). diff --git a/vignettes/articles/writing-custom-fns.Rmd b/vignettes/articles/writing-custom-fns.Rmd new file mode 100644 index 00000000..04034618 --- /dev/null +++ b/vignettes/articles/writing-custom-fns.Rmd @@ -0,0 +1,264 @@ +--- +title: "Writing custom validation functions" +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) + +r_path <- function(hub_path, file_name) { + fs::path(hub_path, "src", "validations", "R", file_name, ext = "R") +} +``` + +```{r setup} +library(hubValidations) +``` + +`hubValidations` provides a wide range of validation `check_*()` functions, but there are times when you might need to write your own custom check functions to check a specific aspect of your hub's submissions. + +This guide will help you understand how to write custom check functions and what tools are available in `hubValidations` to help. 
+ +Custom functions are configured through the `validations.yml` file and executed as part of the `validate_model_data()`, `validate_model_metadata()` and `validate_model_file()` functions. More details about deploying custom check functions during validation workflows are available in **`vignette("articles/deploying-custom-functions")`**. + +## Anatomy of a check function + +While the source code of existing `hubValidations` `check_*()` functions can be a good place to start when writing custom check functions, it is important to understand the structure of a check function, particularly the expected inputs and outputs. + +At its most basic, a custom check function should: + +- take in a set of inputs to be validated +- evaluate whether a condition is met +- return an appropriate check condition object + +In addition, if the check condition is not met, it's also helpful to capture any details that can guide users towards specifics of the failure and how to fix it. + +In general, `hubValidations` check functions evaluate conditions with respect to one or more of the following: + +- Model output submission files +- Model output submission file content (i.e. data) +- Model metadata files + +## `create_custom_check()` for creating custom check function templates + +To help you get started on the right path, we also provide the function `create_custom_check()` for creating a basic custom check function from a template. + +The function requires a name for the new custom check function, e.g. `"example_check"`. It then creates an `.R` script file named after the function (`example_check.R`) and saves it in the hub at the recommended location: `src/validations/R/`. The script contains basic skeleton code to create a custom check function called `example_check`. + +The output of `create_custom_check()` can also be parameterised through a number of arguments to include additional template code snippets (see below for examples). 
+ +Let's take a look at the basic structure of a custom check function created by `create_custom_check()`. + +We'll start by creating a temporary "hub" for us to work in, but if you have an existing hub, you can work in there. + +```{r} +hub_path <- withr::local_tempdir() + +create_custom_check("cstm_check_tbl_basic", + hub_path = hub_path +) +``` + +The contents of the created file at `src/validations/R/cstm_check_tbl_basic.R` are as follows: + + +```{r, code=readLines(r_path(hub_path, "cstm_check_tbl_basic")), eval=FALSE, comment=NA} +``` + + + + + +## Function inputs / arguments + +The minimum inputs required by a custom check function depend on the type of check being performed. + +- **`file_path`**: the relative path to the submission file being validated is required for all check functions. **`file_path` must therefore be included as an argument** in all custom check functions. +- **`tbl`** or **`tbl_chr`**: a **tibble representation of the contents of a model output** submission, with column data types matching the hub schema (`tbl`) or an all-character version (`tbl_chr`), is also **required by any checks that operate on the data** in the submission file. + +Since `file_path` and `tbl` are the most common inputs to check functions, `create_custom_check()` includes them as arguments by default. This means that the **custom check function will include these objects in the function call environment by default**. + + + +In addition to these, custom check functions can also have additional arguments for inputs required by the check. Some of these inputs **are available in the check caller environment** and can be passed automatically to a custom check function by including an argument with the same name as the input object required in the custom function formals. Other inputs can be passed explicitly to function arguments through [the function's `args` field](deploying-custom-functions.html#validations-yml-structure) when configuring the `validations.yml` file. 
+ +### Arguments available in the caller environment + +```{r child="children/_custom-fn-available-args.Rmd", echo=FALSE, results="asis"} +``` + + + +### Additional arguments + +You can add additional arguments to custom check functions and pass values to them by including them in the `args` configuration in the `validations.yml` file. These values are passed to the custom check function by `hubValidations` when the function is called. + + +If you do add additional arguments to a custom check function, you should also add input checks at the start of the function to ensure inputs are valid. [The `checkmate` package](https://mllg.github.io/checkmate/) contains a wide range of functions for checking inputs. + +For example, the optional check `opt_check_tbl_col_timediff()` (which is deployed in exactly the same fashion as custom functions, i.e. through the `validations.yml` file) takes additional arguments `t0_colname`, `t1_colname` and `timediff`. + + +```{r, comment = NA} +opt_check_tbl_col_timediff +``` + + +You can add an example extra argument with `extra_args = TRUE` when creating the custom check function with `create_custom_check()`. + +```{r} +create_custom_check("cstm_check_tbl_args", + hub_path = hub_path, + extra_args = TRUE +) +``` + +This adds an extra example argument `extra_arg` to the custom check function formals as well as an example input check to the top of the function body. + +```{r, code=readLines(r_path(hub_path, "cstm_check_tbl_args")), eval=FALSE, comment=NA} +``` + + +## Function output + +### Capturing and returning check results with `capture_check_cnd()` + +The `capture_check_cnd()` function is used to return a check condition (success, failure, or error) and its output is what a custom check function should return in most cases (see below for the exception). 
The function returns a `` class object depending on the value passed to the `check` argument, which represents the summary of the condition being checked by a given validation function. + +If the value passed to `check` is `TRUE`, the function returns a `` class object. + +If the value is `FALSE`, the output depends on the `error` argument. + + +- If `error` is `FALSE` (the default), the function returns a `` class object, which indicates the check has failed. +- If `error` is `TRUE`, the function returns a `` class object, which indicates the check has failed and additionally causes execution of further custom validation functions to halt. Set `error = TRUE` if downstream checks cannot be run when the current check fails. + +```{r} +create_custom_check("cstm_check_tbl_error", + hub_path = hub_path, error = TRUE +) +``` + +```{r, code=readLines(r_path(hub_path, "cstm_check_tbl_error")), eval=FALSE, comment=NA} +``` + + +### Skipping checks and returning a message with `capture_check_info()` + + +A check is not always applicable: sometimes a pre-condition needs to be met before the main check itself is performed. **If the pre-condition is not met, the check is usually skipped**. + +**For such checks, the function should return a `` object**, generated by the `capture_check_info()` function. Use the `msg` argument to explain that a check was skipped and why. + + +```{r} +capture_check_info( + "modelA-teamA/2024-09-12-modelA-teamA", + "Condition for running this check was not met. Skipped." +) +``` + +For example, the `check_tbl_value_col_ascending()` check function, which validates that values are ascending when arranged by increasing `output_type_id` order, is only applicable to `cdf` and `quantile` output types. Before proceeding with the main check, the function first checks whether the model output `tbl` contains data for `cdf` and `quantile` output types. If not, the check is skipped. 
+ +```{r, comment = NA} +check_tbl_value_col_ascending +``` + + +You can add a pre-condition check block of code with argument `conditional = TRUE` when creating the custom check function with `create_custom_check()`. + +```{r} +create_custom_check("cstm_check_tbl_skip", + hub_path = hub_path, + conditional = TRUE +) +``` + + +```{r, code=readLines(r_path(hub_path, "cstm_check_tbl_skip")), eval=FALSE, comment=NA} +``` + + +## Loading config files + +Many checks are conditioned against information stored in hub configuration files, and these are often read in at the start of the custom check function. + +The easiest way to make hub configuration information available within your function is to pass the `hub_path` caller environment object by specifying it as a function argument and then use `hubUtils::read_config(hub_path)` to read in the `tasks.json` configuration file. + +You can add a `config = TRUE` argument when creating the custom check function with `create_custom_check()` to include the `hub_path` argument and insert a code snippet in your custom check function skeleton that reads in the `tasks.json` hub configuration file. + + +```{r} +create_custom_check("cstm_check_tbl_config", + hub_path = hub_path, + config = TRUE +) +``` + + +```{r, code=readLines(r_path(hub_path, "cstm_check_tbl_config")), eval=FALSE, comment=NA} +``` + + +## Custom function dependencies + +When writing your functions, you might want to use functions from other packages. + + +```{r child="children/_add-deps-source.Rmd", echo=FALSE, results="asis"} +``` + + +## Managing custom check functions as a package + +The simplest way to manage custom check functions is to store them as scripts in the `src/validations/R` directory in the root of the hub and source them during validation by specifying the path to a custom functions file via the `source:` property in `validations.yml`. + +Alternatively, you could **manage your custom functions as a package**. 
+ +You can easily [turn the contents of `src/validations` into a local `validations` R package](https://r-pkgs.org/whole-game.html#create_package) with: + +```{r, eval=FALSE} +usethis::create_package("src/validations", open = FALSE) +``` + +This would allow you to: + +- Make your functions available locally to users, who can run `pak::local_install("src/validations")` in the hub root to install the `validations` package. +- [Manage additional dependencies](https://r-pkgs.org/description.html#sec-description-imports-suggests) required by your custom functions formally through the `DESCRIPTION` file. +- [Formally test your custom functions](https://r-pkgs.org/testing-basics.html) using `testthat` tests in the `tests/testthat` directory. + + +```{r child="children/_add-deps-pkg.Rmd", echo=FALSE, results="asis"} +``` + +If you want to share custom functions across multiple hubs, you could even consider separating them into a standalone package and hosting them on GitHub.
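To illustrate the testing benefit listed above, here is a rough sketch of what a `testthat` test for a custom check could look like. The check function, its name, the `value` column and the file path are hypothetical. The sketch also assumes that the objects returned by `capture_check_cnd()` carry the classes `check_success` and `check_failure`; adjust the expectations if your installed version of `hubValidations` uses different class names.

```r
library(testthat)

# Hypothetical custom check: the `value` column must contain no missing values.
cstm_check_tbl_no_missing <- function(tbl, file_path) {
  check <- !anyNA(tbl[["value"]])

  hubValidations::capture_check_cnd(
    check = check,
    file_path = file_path,
    msg_subject = "{.var value} column",
    msg_attribute = "free of missing values.",
    error = FALSE
  )
}

test_that("cstm_check_tbl_no_missing flags missing values", {
  # Illustrative path; the check above never reads the file itself
  file_path <- "teamA-modelA/2024-09-12-teamA-modelA.csv"

  ok <- cstm_check_tbl_no_missing(data.frame(value = c(0.1, 0.9)), file_path)
  bad <- cstm_check_tbl_no_missing(data.frame(value = c(0.1, NA)), file_path)

  # Assumed class names for passing and failing checks
  expect_s3_class(ok, "check_success")
  expect_s3_class(bad, "check_failure")
})
```

In a package layout, the function would live in `R/` and the `test_that()` call in a file under `tests/testthat`, so that the tests run as part of the usual package checks.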