Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clean up tasks.json #1

Closed
elray1 opened this issue Nov 11, 2022 · 14 comments
Closed

clean up tasks.json #1

elray1 opened this issue Nov 11, 2022 · 14 comments
Assignees

Comments

@elray1
Copy link
Contributor

elray1 commented Nov 11, 2022

A couple of things I noticed about the draft tasks.json file we put together:

  1. Currently, quantile levels are listed here, under #/$defs/task_ids/quantile/quantile. This is not technically wrong in the sense that the reference can be resolved without issue. But it may be confusing to readers because quantiles are not a task id, but are rather a part of the output representation specification.
  2. We should add a boolean "round_id_from_variable": false flag
@LucieContamin LucieContamin self-assigned this Nov 16, 2022
@LucieContamin
Copy link
Contributor

LucieContamin commented Nov 16, 2022

  1. Very good point but I am not sure I understand the solution. I agree that task_id "section" is not the best place. Should I move it to the "output_types" section of the "defs" or do you suggest to completely avoid having it in the "defs" part of the JSON?
  2. Thanks for noticing it, I will add it in the next commit. However, I am not clear on how to implement it yet. Do we have some documentation already on that? Thanks!

@elray1
Copy link
Contributor Author

elray1 commented Nov 16, 2022

  1. How about if we move it to the "output_types" section of the "defs"? I like the idea of illustrating that references can still be used to specify quantile levels.

@LucieContamin
Copy link
Contributor

  1. I agree and like the idea too

@annakrystalli
Copy link
Member

Re: the $defs section. My understanding is that you don't even need to reflect the nested structure in the actual JSON. You could even have the array of valid quantiles under #defs/quantile/ if you wanted, unless of course what you are referencing (and want to replace by the reference) is nested itself. In that case you still reference the top level node of the structure you are referencing though. Having said that, more structure could add more traceability to where the definition is actually used but is not necessary. Agree that putting under task_ids/ though could prove confusing.

Regarding the round_id_from_variable property, good catch it is indeed missing! I feel it is a useful property that can guide certain type of checks so am in favour of including as required boolean as @elray1 you've suggested. I also agree that the simplest in the case where round_id_from_variable = true is for the variable to be the round_id value and completely do away with the round_id_variable property.

@annakrystalli
Copy link
Member

One question I had is, if a given round has round_id_from_variable = true but multiple sets of model_task specification, the round_id variable is shared by all sets and should therefore be present in all sets, correct?

@LucieContamin
Copy link
Contributor

As a given round should only have 1 id, the id is shared for all specification associated with this round. I think having the round_id_from_variable at the top level of the round specification will be enough, no?

@elray1
Copy link
Contributor Author

elray1 commented Nov 17, 2022

I think Anna's second comment here was more related to a forecast hub example where the round id corresponds to the value of a task id variable. I agree that in that kind of settings, the variable that is used as a round_id should appear in all of the task groups. I added a comment about this to this issue on the schemas repo.

@annakrystalli
Copy link
Member

Yes definitely. It's just that for a given model task, there can be multiple sets of valid task_id combinations.

E.g. in this example, there are two elements in model_tasks (groups of task_ids & output_types) that differ in the target variable & cdf specification: https://github.com/annakrystalli/hub-infrastructure-experiments/blob/bcb116d5f438f7ff217db74659c9d6bd5b4d274b/json-schema/modified-hubmeta-examples/complex-hubmeta-mod.json#L4-L110

I just wanted to double check that there is absolutely no situation when one of these groups does not contain the round_id variable as a task_id variable.

@elray1
Copy link
Contributor Author

elray1 commented Nov 17, 2022

In this case, the round_id is specified directly as "round_id": "round-1", so we will set "round_id_from_variable": false, there is no round_id variable, and there is no requirement for a round_id variable to appear in any of these groups, right?

@LucieContamin
Copy link
Contributor

In the case of one of the complex example (like the one @annakrystalli shared), I agree with @elray1 , we will have a round_id set to a specific value and "round_id_from_variable": false.

On the forecast hub example with "round_id_from_variable": true, I agree that in that settings we will need to have the associated round_id variable appearing in all of the tasks groups.

@annakrystalli
Copy link
Member

OK great! Regarding the "round_id_from_variable": true situation, there's no way to encode this requirement in the schema so it can be validated as part of initial json validation sadly but we can include the requirement in the docs as well as validate it in R after the general schema is validated.

@elray1
Copy link
Contributor Author

elray1 commented Nov 17, 2022

OK, so to wrap up the original question -- I think that what we want to do is add "round_id_from_variable": false in each round's definition, at the same level as the round_id specification.

@annakrystalli
Copy link
Member

Yup! It's what's currently implemented.

@LucieContamin
Copy link
Contributor

ok, sounds good. Thank you!

@elray1 elray1 closed this as completed Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants