Our expand grid functions are not aware that the value of target_end_date depends on the value of reference_date (and on horizon). They treat the values of each task ID as independent and produce the following erroneous grid of valid value combinations, where the values of target_end_date are clearly inconsistent with the horizon.
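A minimal base R sketch of the same effect, using illustrative values rather than the hubUtils internals: `expand.grid()` crosses every value of every column, so it also emits the pairs (horizon 1, 2023-10-28) and (horizon 2, 2023-10-21), which cannot both be right for a single reference date.

```r
# Illustrative values only; in a real hub these come from tasks.json.
reference_date <- "2023-10-14"
horizon <- c(1L, 2L)
target_end_date <- c("2023-10-21", "2023-10-28")

# Independent expansion: every horizon is paired with every target_end_date.
grid <- expand.grid(
  reference_date = reference_date,
  horizon = horizon,
  target_end_date = target_end_date,
  stringsAsFactors = FALSE
)

# 4 rows come out, although only 2 horizon/target_end_date pairs make sense.
print(grid)
```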
This has 2 implications:

It can validate erroneous combinations of values (e.g. a row with reference_date: 2023-10-14, horizon: 2 and target_end_date: 2023-10-21). This can be mitigated by the additional optional test that checks the values of target_end_date with respect to horizon and reference_date.
MORE IMPORTANTLY: It can cause erroneous failures in required values checks. For example, here's a subset of the missing values the check for required values erroneously returns.
What's going on is that some quantile values are being submitted for optional horizon 2 as well as required horizon 1. The horizon 2 values have a different target_end_date value (2023-10-28, rather than 2023-10-21 for horizon 1). Note as well that all values in target_end_date are configured as optional. The check therefore detects that data has been supplied for optional target_end_date value 2023-10-28 with optional horizon value 2, but not for optional target_end_date value 2023-10-28 with required horizon value 1. It throws an error even though the combination of target_end_date 2023-10-28 and required horizon 1 is invalid for a reference date of 2023-10-14.
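The false positive can be reproduced in a few lines of base R. This is a simplified sketch of the logic described above, not the hubValidations code: the names `submitted`, `required_horizon` and `key()` are made up for illustration, and only two task ID columns are modelled.

```r
# What the team submitted: each horizon with its matching target_end_date.
submitted <- data.frame(
  horizon = c(1L, 2L),
  target_end_date = c("2023-10-21", "2023-10-28"),
  stringsAsFactors = FALSE
)

# Only horizon 1 is required; all target_end_date values are optional.
required_horizon <- 1L

# Each optional target_end_date seen in the submission is crossed with the
# required horizon, whether or not that pairing is valid.
expected <- expand.grid(
  horizon = required_horizon,
  target_end_date = unique(submitted$target_end_date),
  stringsAsFactors = FALSE
)

# Build a comparable key per row and report expected rows that are absent.
key <- function(df) paste(df$horizon, df$target_end_date)
missing_rows <- expected[!key(expected) %in% key(submitted), , drop = FALSE]

# Flags horizon 1 with 2023-10-28, a combination that can never exist.
print(missing_rows)
```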
Overall, this is caused by the logic discussed in #17 and encapsulated by the required-values-only model output submission template. It arises from us not being able to define relationships between values in different columns.
hubUtils::create_model_out_submit_tmpl(con,
  round_id = "2023-10-14",
  required_vals_only = TRUE,
  complete_cases_only = FALSE) |> dput()
#> ! Columns "target", "location", and "target_end_date" whose values are all optional included as all `NA`
#> columns.
#> ! Round contains more than one modeling task (2)
#> ℹ See Hub's tasks.json file or <hub_connection> attribute "config_tasks" for details of optional task
#> ID/output_type/output_type ID value combinations.
#> # A tibble: 28 × 8
#>    reference_date target horizon location target_end_date output_type
#>    <date>         <chr>    <int> <chr>    <date>          <chr>
#>  1 2023-10-14     <NA>        NA <NA>     NA              pmf
#>  2 2023-10-14     <NA>        NA <NA>     NA              pmf
#>  3 2023-10-14     <NA>        NA <NA>     NA              pmf
#>  4 2023-10-14     <NA>        NA <NA>     NA              pmf
#>  5 2023-10-14     <NA>        NA <NA>     NA              pmf
#>  6 2023-10-14     <NA>         1 <NA>     NA              quantile
#>  7 2023-10-14     <NA>         1 <NA>     NA              quantile
#>  8 2023-10-14     <NA>         1 <NA>     NA              quantile
#>  9 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 10 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 11 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 12 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 13 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 14 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 15 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 16 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 17 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 18 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 19 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 20 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 21 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 22 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 23 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 24 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 25 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 26 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 27 2023-10-14     <NA>         1 <NA>     NA              quantile
#> 28 2023-10-14     <NA>         1 <NA>     NA              quantile
#> # ℹ 2 more variables: output_type_id <chr>, value <dbl>

Created on 2023-09-29 with reprex v2.0.2
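The cross-column relationship that the config cannot currently express is easy to state in code. The sketch below is a hypothetical helper, not a hubValidations function, and it assumes horizons are counted in weeks (a 7-day step), which is inferred from the dates in this issue rather than read from any config.

```r
# Hypothetical helper: TRUE where target_end_date agrees with
# reference_date + step_days * horizon.
# Assumption: one horizon unit equals 7 days.
check_target_end_date <- function(tbl, step_days = 7L) {
  expected <- as.Date(tbl$reference_date) + step_days * tbl$horizon
  as.Date(tbl$target_end_date) == expected
}

rows <- data.frame(
  reference_date = c("2023-10-14", "2023-10-14"),
  horizon = c(2L, 2L),
  target_end_date = c("2023-10-21", "2023-10-28"),
  stringsAsFactors = FALSE
)

# First row is the inconsistent combination, second row is the valid one.
print(check_target_end_date(rows))
```

A check of this shape only catches inconsistent rows after the fact. It does not stop the expanded grid from containing them in the first place.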
This is a tricky issue and I'm not 100% sure how to proceed. The easiest way I can think of is to be able to ignore task IDs in certain situations like these. It will likely need some time to fix, though, as it likely needs work on complex functions across hubUtils and hubValidations. Keen on hearing thoughts!
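One possible reading of "ignore task IDs", sketched in base R purely for discussion: leave the derived column out of the key used by the required values comparison. The `ignore` vector and the `key()` helper are hypothetical and do not correspond to any existing hubUtils or hubValidations argument.

```r
submitted <- data.frame(
  horizon = c(1L, 2L),
  target_end_date = c("2023-10-21", "2023-10-28"),
  stringsAsFactors = FALSE
)
required <- data.frame(horizon = 1L)

# Hypothetical: task IDs whose values are derived from other task IDs.
ignore <- "target_end_date"
check_cols <- setdiff(names(submitted), ignore)

# Build a row key from the remaining columns only.
key <- function(df, cols) {
  do.call(paste, c(unname(as.list(df[cols])), sep = "|"))
}

missing_rows <- required[
  !key(required, check_cols) %in% key(submitted, check_cols), ,
  drop = FALSE
]

# No rows are reported once target_end_date is left out of the comparison.
print(nrow(missing_rows))
```

The trade-off is that an ignored column is no longer covered by the required values check at all, so its consistency would have to be enforced by a separate test.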
The problem was uncovered while trying to validate https://github.com/annakrystalli/FluSight-forecast-hub/blob/test-pr/model-output/UMass-trends_ensemble/2023-10-14-UMass-trends_ensemble.csv