PSI-PROF COVID rnd 18 submission #278
Run validation on files: 2024-04-28-PSI-PROF.gz.parquet
Columns: No errors or warnings found on the column names and numbers
Scenarios: No errors or warnings found on scenario name and scenario id columns
Origin Date Column: No errors or warnings found on the column 'origin_date'
Value and Type Columns: 🟡 Warning 5043: All values associated with output type 'sample' should have a maximum of 1 decimal place
Target Columns: ❌ Error 602: The data frame does not contain projections for 'inc death' target(s).
Locations: No errors or warnings found on Location
Sample: No errors or warnings found on Sample
Quantiles: No errors or warnings found on quantiles values and format
Age Group: No errors or warnings found on Age_group
|
Hi @jturtle,
Thank you for your submission. I just wanted to verify that I understand the sample ID numbering correctly: there is no stochasticity, and every model run has a different run_grouping, "grouped" by age_group, horizon and scenario_id. Is that correct?
Also, I was wondering if it would be possible to round your value column to a maximum of 1 decimal place in your next update (it is not necessary to fix it now).
Please let me know if you have any issues or questions,
Best, Lucie
|
Hi Lucie,
At the risk of giving an overly complicated answer, I will say that there
is stochasticity in our results. However, each location-run_grouping pair
uses a unique set of parameters. So all horizon-age-scenario targets
associated with a location-run_grouping use a single parameter set, but
each individual value within that group experiences stochasticity. My
understanding of the stochastic_run column is that it is for when a
parameter set gets stochastically re-used for multiple trajectories, which
is not the case for our results. Hopefully this answer has been helpful,
but let me know if you would like further clarification.
Thank you,
Jamie
|
Sorry, missed your second point. I rounded all states to one decimal
place, but then forgot to do it to the national rows. I have corrected the
submission code for future iterations.
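
A minimal sketch of the kind of fix described above, assuming the submission is held as a list of row dictionaries. The keys `location`, `output_type` and `value`, the location codes, and both helper names are illustrative assumptions, not the team's actual submission code. The point is that rounding is applied to every sample row regardless of location, so national rows cannot be skipped.

```python
def round_sample_values(rows, ndigits=1):
    """Return copies of the rows with 'value' rounded on every sample row."""
    rounded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row["output_type"] == "sample":
            new_row["value"] = round(new_row["value"], ndigits)
        rounded.append(new_row)
    return rounded


def rows_with_extra_decimals(rows, ndigits=1):
    """Return sample rows whose value changes when rounded to ndigits."""
    return [
        row
        for row in rows
        if row["output_type"] == "sample"
        and round(row["value"], ndigits) != row["value"]
    ]


# Hypothetical rows: one state-level row and one national row.
rows = [
    {"location": "06", "output_type": "sample", "value": 12.34},
    {"location": "US", "output_type": "sample", "value": 1234.5678},
]
fixed = round_sample_values(rows)
```

Running `rows_with_extra_decimals(fixed)` before writing the file gives a quick local check in the spirit of Warning 5043, although it is not the hub's own validation code.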
|
Hi @jturtle,
Thanks for the quick answer and the additional information. We interpret a stochastic_run column filled entirely with 1 as meaning that there is no stochasticity. So, to avoid confusion, would it be possible to use a different number for each stochastic run in your next update (it's OK for this version)? We know this won't change the interpretation of the results, but we want to be consistent to avoid misinterpretation in the future.
Please let me know if you have any issues or questions,
Best, Lucie
|
Okay, I will change the code to assign stochastic run numbers. After
reviewing the sample_format.html document again, I am still not completely
sure what is meant by a stochastic "run". Here are three ways I can
interpret it:
1. A run is similar to a group. In our case a run consists of all
horizons, ages, and scenarios associated with a single parameter set. We
run each group only once, so the group id would match the stochastic id.
2. A run is a model run. In our case, a model run generates all horizons
and ages for a single scenario. We do six model runs with the same
parameter set (except the scenario axes) to generate a group of scenarios.
So we should assign six different stochastic ids for each group.
3. The stochasticity is independent of parameters, scenario, age, horizon,
and location, so each value of the submission is the result of an
independent run of the stochastic process. Therefore each row of the
submission should be assigned a unique stochastic_run id.
Sorry for the long reply. I know you are busy. I just want to be sure
that we do it correctly in the next iteration.
Thank you,
Jamie
|
Good morning Jamie,
Thank you for your questions. These columns can be difficult to fill in, and you ask a very good question.
The "stochastic run" column is here to differentiate multiple stochastic runs in your model run.
So, for example, if you have a run that consists of all of the horizons, ages and scenarios (your example (1)) associated with a single parameter set and sharing the same random seed, then you can have both run_grouping and stochastic_run with the same numbering, with the group id matching the stochastic id as you say.
If you have a model run that generates all of the horizons and ages for a single scenario (your example (2)), and do 6 model runs with the same parameters but different stochasticity for each run, then each group can have different stochastic ids.
Just to add more information and avoid confusion, the difference between (1) and (2) comes down to whether you think individual trajectories for different scenarios are directly comparable (because of their stochasticity).
Example (3), where stochasticity is independent of parameters, will not be accepted in our validation system. It is expected that at least the horizon and age are grouped together.
Also, just to add more context, this numbering system mostly depends on how you want us to interpret your results. If you think that some groups have the "same" stochasticity, they should match, and if not, they shouldn't.
Does that answer your question? Let me know if you need more information.
I am always happy to answer any questions and, following this exchange, I am also happy to update the documentation accordingly, if necessary.
Best,
Lucie
|
Hi Lucie,
Thank you very much for the clarifications. We were thinking about the
stochasticity as having no effect on the distribution of scenario A samples
relative to the distribution of scenario B samples. This is correct, but I
see now that it misses the point of the 'stochastic_run' column. We have
set seeds such that our stochastic ids will match our group ids when we
submit an update.
Best,
Jamie
|
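
The two numbering schemes accepted in the exchange above, interpretations (1) and (2), can be sketched as follows. This is one reading of the discussion, not the hub's validation code. The function name `assign_ids`, the `shared_seed` flag and the scenario labels are illustrative assumptions, and locations are left out for brevity. Each returned tuple stands for one model run; all horizons and age groups produced by that run would carry the same pair of ids, which keeps horizon and age grouped together.

```python
from itertools import product


def assign_ids(n_param_sets, scenarios, shared_seed):
    """Return one (scenario_id, run_grouping, stochastic_run) tuple per model run.

    shared_seed=True  -> interpretation (1): all scenarios of a parameter set
                         reuse one random seed, so stochastic_run == run_grouping.
    shared_seed=False -> interpretation (2): each scenario run has its own
                         stochasticity, so every run gets a distinct stochastic_run.
    """
    ids = []
    next_stochastic_run = 1
    for run_grouping, scenario_id in product(range(1, n_param_sets + 1), scenarios):
        if shared_seed:
            stochastic_run = run_grouping
        else:
            stochastic_run = next_stochastic_run
            next_stochastic_run += 1
        ids.append((scenario_id, run_grouping, stochastic_run))
    return ids


# Placeholder scenario labels; the real scenario_id values are not given here.
scenarios = ["A", "B"]
matched = assign_ids(2, scenarios, shared_seed=True)
distinct = assign_ids(2, scenarios, shared_seed=False)
```

With `shared_seed=True`, trajectories for different scenarios within a group share a stochastic id and are presented as directly comparable; with `shared_seed=False` they are not.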
Hi Jamie, That sounds good, thank you very much! Best, Lucie |
Description
Initial submission to the round 18 scenario.
Notes to repo administrator
This submission contains only 'inc hosp'. This is intentional. We have a number of small details we would like to address—including adding 'inc death', but this update is at least two days away, and probably will come next week. These small changes will not qualitatively change how the scenarios relate to each other in our model. This submission passes all validation tests not related to the presence of 'inc death'.