Commit df290cb (1 parent: fde519f). Showing 2 changed files with 320 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,183 @@ | ||
--- | ||
title: "Validating Pull Requests on GitHub" | ||
--- | ||
|
||
```{r, include = FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
```{r setup} | ||
library(hubValidations) | ||
``` | ||
|
||
## Running validation checks on a Pull Request with `validate_pr()` | ||
|
||
The `validate_pr()` function is designed to validate team submissions made through Pull Requests on GitHub. | ||
Only model output and model metadata files are individually validated using `validate_submission()` on each file. | ||
As part of checks, however, hub config files are also validated. | ||
Any other files included in the PR are ignored but flagged in a message. | ||
|
||
### Deploying `validate_pr()` through a GitHub Action workflow | ||
|
||
The most common way to deploy `validate_pr()` is through a GitHub Action that triggers when a pull request containing changes to model output or model metadata files is opened. | ||
The hubverse maintains the [**`validate-submission.yaml`**](https://github.com/Infectious-Disease-Modeling-Hubs/hubverse-actions/tree/main/validate-submission) GitHub Action workflow template for deploying `validate_pr()`. | ||
|
||
The latest release of the workflow can be added to a hub's GitHub Action workflows using the `hubCI` package: | ||
```{r, eval = FALSE} | ||
hubCI::use_hub_github_action("validate-submission") | ||
``` | ||
|
||
|
||
The pertinent section of the workflow is: | ||
|
||
```yaml | ||
- name: Run validations | ||
env: | ||
PR_NUMBER: ${{ github.event.number }} | ||
run: | | ||
library("hubValidations") | ||
v <- hubValidations::validate_pr( | ||
gh_repo = Sys.getenv("GITHUB_REPOSITORY"), | ||
pr_number = Sys.getenv("PR_NUMBER"), | ||
skip_submit_window_check = FALSE | ||
) | ||
hubValidations::check_for_errors(v, verbose = TRUE) | ||
shell: Rscript {0} | ||
``` | ||
Here, `validate_pr()` is called on the contents of the current Pull Request and the results (an S3 `<hub_validations>` class object) are stored in `v`. `check_for_errors()` is then used to signal whether overall validations have passed or failed and to summarise any validation failures. | ||
|
||
|
||
### Skipping submission window checks | ||
|
||
Most hubs require that model output files for a given round are submitted within a submission window [defined in the `"submission_due"` property of the `tasks.json` hub config file](https://hubdocs.readthedocs.io/en/latest/quickstart-hub-admin/tasks-config.html#setting-up-submissions-due). | ||
|
||
`validate_pr()` includes submission window checks for model output files and returns a `<warning/check_failure>` condition class object if a file is submitted outside the accepted submission window. | ||
|
||
To disable submission window checks, argument `skip_submit_window_check` can be set to `TRUE`. | ||
|
||
### Configuring file modification/deletion/renaming checks | ||
|
||
For most hubs, **modification, renaming or deletion of previously submitted model output files** or **deletion/renaming of previously submitted model metadata files** is not desirable without justification. They should therefore trigger validation failure and notify hub maintainers of the files affected. | ||
At the same time, most hubs prefer to allow modifications to model output files within their allowed submission window. | ||
|
||
Reflecting these preferences, by default, `validate_pr()` checks for modification, renaming or deletion of previously submitted model output files and deletion/renaming of previously submitted model metadata files and appends an `<error/check_error>` class object to the output for each file modification/deletion/renaming detected. | ||
It does however allow modifications to model output files within their allowed submission window. | ||
|
||
|
||
```{r} | ||
temp_hub <- fs::path(tempdir(), "mod_del_hub") | ||
gert::git_clone( | ||
url = "https://github.com/Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
path = temp_hub, | ||
branch = "test-mod-del" | ||
) | ||
``` | ||
|
||
|
||
```{r} | ||
v <- validate_pr( | ||
hub_path = temp_hub, | ||
gh_repo = "Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
pr_number = 6, | ||
skip_submit_window_check = TRUE | ||
) | ||
v | ||
``` | ||
|
||
|
||
These settings can be modified if required through the use of arguments `file_modification_check` and `allow_submit_window_mods`. | ||
|
||
- **`file_modification_check`** controls whether modification/deletion checks are performed, what is returned if modifications/deletions are detected and accepts one of the following values: | ||
|
||
- **`"error"`**: Appends an `<error/check_error>` condition class object for each applicable modified/deleted file. Will result in validation workflow failure. | ||
- **`"warning"`**: Appends a `<warning/check_warning>` condition class object for each applicable modified/deleted file. Will result in validation workflow failure. | ||
- **`"message"`**: Appends a `<message/check_info>` condition class object for each applicable modified/deleted file. Will not result in validation workflow failure. | ||
- **`"none"`**: No modification/deletion checks performed. | ||
|
||
- **`allow_submit_window_mods`** controls whether modifications/deletions of model output files are allowed within their submission windows. Is set to `TRUE` by default but can be set to `FALSE` if modifications/deletions are not allowed, regardless of timing. | ||
Is ignored when checking model metadata files as well as when `file_modification_check` is set to `"none"`. | ||
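As a minimal sketch of overriding these defaults, a hub that never allows modifications or deletions, regardless of timing, could wrap `validate_pr()` as below. The helper name `validate_pr_strict()` is hypothetical and not part of `hubValidations`; only the argument names and values come from the descriptions above. The chunk is not evaluated.

```{r, eval = FALSE}
# Hypothetical helper (not part of hubValidations): report every modification,
# renaming or deletion as an error, even inside the submission window.
validate_pr_strict <- function(hub_path, gh_repo, pr_number) {
  hubValidations::validate_pr(
    hub_path = hub_path,
    gh_repo = gh_repo,
    pr_number = pr_number,
    file_modification_check = "error",
    allow_submit_window_mods = FALSE
  )
}
```

With the test hub cloned above, it could be called as `validate_pr_strict(temp_hub, "Infectious-Disease-Modeling-Hubs/ci-testhub-simple", 6)`.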
|
||
|
||
<div class="alert alert-warning" role="alert"> | ||
|
||
#### Warning | ||
|
||
Note that to establish **relative** submission windows when performing modification/deletion checks and `allow_submit_window_mods` is `TRUE`, the reference date is taken as the `round_id` extracted from the file path. | ||
This is because we cannot extract dates from columns of deleted files. | ||
If hub submission window reference dates do not match round IDs in file paths, currently `allow_submit_window_mods` will not work correctly and is best set to `FALSE`. | ||
This only relates to hubs/rounds where submission windows are determined relative to a reference date and not when explicit submission window start and end dates are provided in the config. | ||
|
||
For more details on submission window config see [Setting up `"submission_due"`](https://hubdocs.readthedocs.io/en/latest/quickstart-hub-admin/tasks-config.html#setting-up-submissions-due) in the hubverse hubDocs. | ||
|
||
</div> | ||
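To illustrate the warning above, the sketch below shows roughly how a relative submission window could be derived from a file path alone. It is not the package's implementation. It assumes that the round ID is the leading `YYYY-MM-DD` date of the file name, and the start and end offsets are hypothetical values chosen for illustration rather than taken from any hub config.

```{r}
# Example file path; the offsets below are assumed values, not real config.
file_path <- "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
start_offset <- -6L # assumed: window opens 6 days before the reference date
end_offset <- 1L    # assumed: window closes 1 day after the reference date

# Take the leading YYYY-MM-DD of the file name as the round ID
file_name <- basename(file_path)
round_id <- regmatches(
  file_name,
  regexpr("^[0-9]{4}-[0-9]{2}-[0-9]{2}", file_name)
)

# The round ID stands in for the reference date, since the columns of a
# deleted file cannot be read
reference_date <- as.Date(round_id)
window <- list(
  start = reference_date + start_offset,
  end = reference_date + end_offset
)
window
```

If the reference date configured for the hub differs from the round ID in the file path, a window derived this way is shifted, which is why `allow_submit_window_mods` is best set to `FALSE` in that case.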
|
||
|
||
## Checking for validation failures with `check_for_errors()` | ||
|
||
`check_for_errors()` is used to inspect a `hub_validations` class object, determine whether overall validations have passed or failed and summarise any detected errors/failures. | ||
|
||
### Validation failure | ||
|
||
If any elements of the `hub_validations` object contain `<error/check_error>`, `<warning/check_warning>` or `<error/check_exec_error>` condition class objects, the function throws an error and prints the messages from the failing checks. | ||
|
||
```{r, error=TRUE} | ||
temp_hub <- fs::path(tempdir(), "invalid_sb_hub") | ||
gert::git_clone( | ||
url = "https://github.com/Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
path = temp_hub, | ||
branch = "pr-missing-taskid" | ||
) | ||
v_fail <- validate_pr( | ||
hub_path = temp_hub, | ||
gh_repo = "Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
pr_number = 5, | ||
skip_submit_window_check = TRUE | ||
) | ||
check_for_errors(v_fail) | ||
``` | ||
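The failure rule described above can be sketched in base R with hand-built stand-ins for check results. The objects, check names and messages below are mock values for illustration, not output from `hubValidations`, and the code is not how `check_for_errors()` is implemented. Only the class names are taken from the text.

```{r}
# Build a mock check result carrying the given condition classes
mock_check <- function(message, check_class, type) {
  structure(
    list(message = message, call = NULL),
    class = c(check_class, "hub_check", type, "condition")
  )
}

# Mock results: check names and messages are made up for illustration
mock_results <- list(
  file_exists = mock_check("File exists.", "check_success", "message"),
  valid_vals = mock_check("Invalid values detected.", "check_error", "error"),
  submission_window = mock_check("Outside window.", "check_warning", "warning")
)

# A check counts as failing if it inherits from any of these classes
failing_classes <- c("check_error", "check_warning", "check_exec_error")
is_failing <- vapply(mock_results, inherits, logical(1), what = failing_classes)

# Names of the checks that would cause overall validation to fail
names(mock_results)[is_failing]
```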
|
||
### Validation success | ||
|
||
If all validation checks pass, `check_for_errors()` prints the following message and silently returns `TRUE`: | ||
|
||
``` | ||
✔ All validation checks have been successful. | ||
``` | ||
|
||
```{r} | ||
temp_hub <- fs::path(tempdir(), "valid_sb_hub") | ||
gert::git_clone( | ||
url = "https://github.com/Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
path = temp_hub, | ||
branch = "pr-valid" | ||
) | ||
v_pass <- validate_pr( | ||
hub_path = temp_hub, | ||
gh_repo = "Infectious-Disease-Modeling-Hubs/ci-testhub-simple", | ||
pr_number = 4, | ||
skip_submit_window_check = TRUE | ||
) | ||
check_for_errors(v_pass) | ||
``` | ||
|
||
|
||
### Verbose output | ||
|
||
If printing the results of all checks is preferred instead of just summarising the results of checks that failed, argument `verbose` can be set to `TRUE`. | ||
|
||
```{r, error=TRUE} | ||
check_for_errors(v_fail, verbose = TRUE) | ||
check_for_errors(v_pass, verbose = TRUE) | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
--- | ||
title: "Validating submissions locally" | ||
--- | ||
|
||
```{r, include = FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
```{r setup} | ||
library(hubValidations) | ||
``` | ||
|
||
While most hubs will have automated validation systems set up to check contributions during submission, `hubValidations` also provides functionality for validating files locally before submitting them. | ||
For this, submitting teams can use `validate_submission()` to validate their model output files prior to submitting. | ||
|
||
|
||
### Structure of `hub_validations` object | ||
|
||
|
||
Each named element contains the result of an individual check and inherits from subclass `<hub_check>`. The name of each element is the name of the check. | ||
|
||
```{r} | ||
hub_path <- system.file("testhubs/simple", package = "hubValidations") | ||
validate_submission(hub_path, | ||
file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv" | ||
) | ||
``` | ||
|
||
|
||
The superclass returned depends on the status of the check: | ||
|
||
- If a check succeeds, a `<message/check_success>` condition class object is returned. | ||
|
||
- If a check is skipped, a `<message/check_info>` condition class object is returned. | ||
|
||
- Checks vary with respect to whether they return an `<error/check_error>` or `<warning/check_failure>` condition class object if the check fails. | ||
Ultimately, both will cause overall validation to fail and the two classes are used primarily to communicate the severity of a failing check. | ||
|
||
### Validation early return | ||
|
||
Some checks are critical to downstream checks. If one of them fails, validation stops early and returns the results of the checks up to and including the critical check that failed. | ||
Such checks generally return an `<error/check_error>` condition class object. | ||
Any problems identified will need to be resolved and the function rerun for validation to proceed further. | ||
|
||
|
||
```{r} | ||
validate_submission(hub_path, | ||
file_path = "team1-goodmodel/2022-10-15-hub-baseline.csv" | ||
) | ||
``` | ||
|
||
### Execution Errors | ||
|
||
If an execution error occurs in any of the checks, an `<error/check_exec_error>` is returned instead. For validation purposes, this results in the same downstream effects as an `<error/check_error>` object. | ||
|
||
|
||
### Checking for errors with `check_for_errors()` | ||
|
||
You can check whether your file passes validation overall by passing the `hub_validations` object to `check_for_errors()`. | ||
|
||
If validation fails, an error will be thrown and the failing checks will be summarised. | ||
|
||
```{r, error=TRUE} | ||
validate_submission(hub_path, | ||
file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv" | ||
) %>% | ||
check_for_errors() | ||
``` | ||
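Because failure is signalled as an error, a script that should carry on after a failed validation needs to catch it. A minimal sketch, assuming only the behaviour described in this vignette (an error on failure, `TRUE` on success), is shown below. The helper name `passes_validation()` is hypothetical and the chunk is not evaluated.

```{r, eval = FALSE}
# Hypothetical helper (not part of hubValidations): returns TRUE if
# check_for_errors() returns TRUE and FALSE if an error is thrown.
passes_validation <- function(v) {
  tryCatch(
    isTRUE(hubValidations::check_for_errors(v)),
    error = function(e) FALSE
  )
}
```

It could then be used in a condition, for example `if (passes_validation(v)) message("Ready to submit")`, where `v` is the output of `validate_submission()`.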
|
||
|
||
|
||
### Skipping the submission window check | ||
|
||
If you are preparing your submission prior to the submission window opening, you might want to skip the submission window check. | ||
You can do so by setting argument `skip_submit_window_check` to `TRUE`. | ||
|
||
As a result, the previous file, which was valid except for failing the submission window check, now passes overall validation. | ||
|
||
```{r} | ||
validate_submission(hub_path, | ||
file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv", | ||
skip_submit_window_check = TRUE | ||
) %>% | ||
check_for_errors() | ||
``` | ||
|
||
|
||
|
||
## Structure of a `<hub_check>` object | ||
|
||
Let's look more closely at the structure of the first few elements of the `hub_validations` object returned by `validate_submission()`: | ||
|
||
```{r} | ||
v <- validate_submission(hub_path, | ||
file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv" | ||
) | ||
str(head(v)) | ||
``` | ||
|
||
Each `<hub_check>` object contains the following elements: | ||
|
||
- `message`: the result message containing details about the check. | ||
- `where`: where the check was performed, usually the model output file name. | ||
- `call`: the function used to perform the check. | ||
- `use_cli_format`: whether the message is formatted using cli format, almost always TRUE. | ||
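Since each check is a list-like object with these elements, a compact overview can be built with base R. The sketch below uses a hypothetical helper, `summarise_checks()`, and a hand-built stand-in object. The check name and element values are made up, and the helper assumes the status class (for example `check_success`) is the first entry of `class()`.

```{r}
# Hypothetical helper: tabulate name, leading class and location of each check
summarise_checks <- function(checks) {
  data.frame(
    check = names(checks),
    class = vapply(checks, function(x) class(x)[1], character(1)),
    where = vapply(checks, function(x) as.character(x$where)[1], character(1)),
    row.names = NULL
  )
}

# Hand-built stand-in; real objects come from validate_submission()
mock_checks <- list(
  file_exists = structure(
    list(
      message = "File exists.",
      where = "team1-goodmodel/2022-10-08-team1-goodmodel.csv",
      call = "check_file_exists",
      use_cli_format = TRUE
    ),
    class = c("check_success", "hub_check", "message", "condition")
  )
)

check_summary <- summarise_checks(mock_checks)
check_summary
```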
|
||
### Extra information | ||
|
||
Some `<hub_check>` objects contain extra information about the failing check to help identify affected rows in submissions. | ||
|
||
For example, the `valid_vals` check verifies that all columns in a model output file (excluding the `value` column) contain valid combinations of task ID / output type / output type ID values. The `<hub_check>` object it returns contains an additional element called `error_tbl`, with details of the invalid value combinations in the rows affected. | ||
|
||
To access `error_tbl` from the output of `validate_submission()` stored in an object `v`, you would use: | ||
|
||
```{r, eval=FALSE} | ||
v$valid_vals$error_tbl | ||
``` | ||
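To find out which checks carry an `error_tbl` without inspecting them one by one, a small helper along the following lines may be useful. The helper name, the mock check names and the columns of the mock `error_tbl` are all hypothetical. With real output, the function would be applied to the object returned by `validate_submission()`.

```{r}
# Hypothetical helper: names of the checks that carry an `error_tbl` element
checks_with_error_tbl <- function(checks) {
  has_tbl <- vapply(
    checks,
    function(x) !is.null(x[["error_tbl"]]),
    logical(1)
  )
  names(checks)[has_tbl]
}

# Hand-built stand-ins; the error_tbl columns are made up for illustration
mock_checks <- list(
  file_exists = list(message = "File exists."),
  valid_vals = list(
    message = "Invalid value combinations detected.",
    error_tbl = data.frame(location = "US", output_type = "mean")
  )
)

checks_with_error_tbl(mock_checks)
```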
|
||
|
||
## `validate_submission` check details | ||
|
||
```{r, echo=FALSE} | ||
library(kableExtra) | ||
arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations")) %>% | ||
dplyr::select(-"parent fun", -"check fun") %>% | ||
dplyr::mutate("Extra info" = dplyr::case_when( | ||
is.na(.data$`Extra info`) ~ "", | ||
TRUE ~ .data$`Extra info` | ||
)) %>% | ||
knitr::kable(caption = "Details of checks performed by `validate_submission()`") %>% | ||
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% | ||
column_spec(1, bold = TRUE) | ||
``` |