Submission format

Paolo Milano edited this page Jun 18, 2024 · 33 revisions

Projections should be stored as a parquet file in your model-output/team-model folder.

The parquet file must use a standardised file name, and contain specific variable names and values which identify the projections you are submitting. The automatic check validates both the filename and file contents to ensure the file is correct.

File name

Each projection file within the subdirectory should have the following name format:

<round_id>-<team>-<model>.parquet

The <round_id> is unique to each submission round and disease. It is composed of the season_cycle, which identifies the season and the submission cycle, and the disease indicator. The team and model in the file name must match the name of the model-output directory that contains the file (and correspond to the team_abbr and model_abbr parameters in the metadata file).
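As a rough illustration of the naming rule, the sketch below splits a file name into its three parts and compares them with the directory name. The exact character rules are assumptions, not part of the specification: it assumes a round_id shaped like 2024_2025_1_FLU, team and model abbreviations without hyphens, and a directory named team-model. The helper names parse_filename and matches_directory are hypothetical.

```python
import re

# Assumed shape: <season_cycle>_<disease>-<team>-<model>.parquet
# The character classes below are illustrative guesses, not official rules.
FILENAME_RE = re.compile(
    r"^(?P<round_id>\d{4}_\d{4}_\d+_[A-Z0-9]+)"
    r"-(?P<team>[A-Za-z0-9_]+)"
    r"-(?P<model>[A-Za-z0-9_]+)\.parquet$"
)


def parse_filename(name):
    """Split a projection file name into round_id, team and model."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return match.groupdict()


def matches_directory(name, directory):
    """Check that the team and model in the file name equal the directory name."""
    parts = parse_filename(name)
    return directory == f"{parts['team']}-{parts['model']}"


# Hypothetical team and model abbreviations, for illustration only.
example = "2024_2025_1_FLU-ExampleTeam-ExampleModel.parquet"
print(parse_filename(example))
print(matches_directory(example, "ExampleTeam-ExampleModel"))
```

The automatic check performed by the hub may apply different or stricter rules; this only mirrors the description above.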

File format

Required variables

The parquet file must contain only the following columns (in any order). No additional columns are allowed.

| column | column type | description |
|---|---|---|
| round_id | string | The id of the submission round, e.g. '2024_2025_1_FLU', composed of the season cycle ('2024_2025_1') plus the disease ('FLU'). Will be defined for each round. |
| scenario_id | string | Id of the scenario as described in the round specifications (e.g. 'A', 'B', ...) |
| target | string | One of the targets defined/allowed for the round |
| location | ISO2 string | One of the ISO 3166-1 alpha-2 (ISO-2) geocodes for the European country. We provide a geocode file to convert between country names and ISO-2 codes or, if using R, you can use the countrycode package. |
| pop_group | string | The age bin, or another population breakdown identifier, as defined in the round specs |
| horizon | integer | Weeks ahead from the origin date(*) corresponding to the predicted value |
| target_end_date | date string | Target date corresponding to the projected value. Values must be a date in the format YYYY-MM-DD. |
| output_type | string | One of "quantile" or "sample" |
| output_type_id | string/integer | When output_type = "sample", an integer from 1 to 100 identifying the stochastic run for sample data. When output_type = "quantile", one of the 23 accepted quantiles, i.e. 0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500 0.550 0.600 0.650 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990 |
| value | float | The value of the prediction for the given target |
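The column rules above can be checked locally before submitting. The following is a minimal sketch using only the Python standard library, with each row represented as a plain dict. The function names check_columns and check_row and the placeholder values in the example row are hypothetical, and the sketch does not reproduce the hub's automatic check: it only covers the column set, the date format and the output_type_id rules.

```python
import math
import re
from datetime import date

REQUIRED_COLUMNS = {
    "round_id", "scenario_id", "target", "location", "pop_group",
    "horizon", "target_end_date", "output_type", "output_type_id", "value",
}

# The 23 accepted quantile levels listed in the table.
ACCEPTED_QUANTILES = (
    0.010, 0.025, 0.050, 0.100, 0.150, 0.200, 0.250, 0.300, 0.350, 0.400,
    0.450, 0.500, 0.550, 0.600, 0.650, 0.700, 0.750, 0.800, 0.850, 0.900,
    0.950, 0.975, 0.990,
)


def check_columns(columns):
    """Return (missing, extra) column names relative to the required set."""
    present = set(columns)
    return sorted(REQUIRED_COLUMNS - present), sorted(present - REQUIRED_COLUMNS)


def check_row(row):
    """Return a list of problems found in one row (a dict); empty if none."""
    missing, extra = check_columns(row.keys())
    if missing or extra:
        return [f"missing columns {missing}, unexpected columns {extra}"]

    problems = []

    # target_end_date must be a real calendar date written as YYYY-MM-DD.
    text = str(row["target_end_date"])
    try:
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
            raise ValueError(text)
        date.fromisoformat(text)
    except ValueError:
        problems.append(f"target_end_date is not YYYY-MM-DD: {text!r}")

    output_type = row["output_type"]
    type_id = row["output_type_id"]
    if output_type == "sample":
        # Sample ids may arrive as strings or integers; accept 1..100.
        try:
            ok = 1 <= int(str(type_id)) <= 100
        except ValueError:
            ok = False
        if not ok:
            problems.append(f"sample id must be an integer from 1 to 100: {type_id!r}")
    elif output_type == "quantile":
        # Compare with a small tolerance to absorb floating-point noise.
        try:
            level = float(type_id)
            ok = any(math.isclose(level, q, abs_tol=1e-9) for q in ACCEPTED_QUANTILES)
        except (TypeError, ValueError):
            ok = False
        if not ok:
            problems.append(f"quantile level not accepted: {type_id!r}")
    else:
        problems.append(f"output_type must be 'quantile' or 'sample': {output_type!r}")
    return problems


# Placeholder values for illustration; real targets and groups come from the round specs.
row = {
    "round_id": "2024_2025_1_FLU", "scenario_id": "A",
    "target": "placeholder target", "location": "IT",
    "pop_group": "placeholder group", "horizon": 1,
    "target_end_date": "2024-10-06", "output_type": "quantile",
    "output_type_id": 0.5, "value": 12.0,
}
print(check_row(row))
```

Rows read from a parquet file with a dataframe library can be passed through the same function one dict at a time.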