Truth Data Format

Paolo Milano edited this page Oct 24, 2024 · 12 revisions

The ground truth data for the forecasting targets is in the folder target-data. To access the latest data file, use this link for ERVISS data. Historical data files are stored in the snapshots folders and are named YYYY-MM-DD-hospital_admissions.csv, where YYYY-MM-DD is the date of the last data update (updates occur every Friday). Note that the latest file contains the entire available history, not only the new data points.
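The snapshot naming convention can be sketched in a few lines of Python. This is a minimal illustration, assuming the date in the file name is exactly the Friday of the update; the helper names `latest_friday` and `snapshot_filename` are hypothetical and not part of the repository.

```python
from datetime import date, timedelta


def latest_friday(today: date) -> date:
    """Return the most recent Friday on or before `today`."""
    # date.weekday() counts Monday as 0, so Friday is 4.
    days_since_friday = (today.weekday() - 4) % 7
    return today - timedelta(days=days_since_friday)


def snapshot_filename(update_date: date) -> str:
    """Build a snapshot file name following YYYY-MM-DD-hospital_admissions.csv."""
    return f"{update_date.isoformat()}-hospital_admissions.csv"


# Assumption: the snapshot is dated with the Friday of the update itself.
print(snapshot_filename(latest_friday(date(2024, 10, 24))))
# prints "2024-10-18-hospital_admissions.csv"
```

If an update is published late or skipped, the expected file may not exist, so any script built on this should handle a missing file gracefully.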

Each ground truth CSV file contains the following columns:

| column | column type | description |
| --- | --- | --- |
| target | string | The forecast target: "hospital admissions" |
| location | string | ISO-2 code identifying the country |
| truth_date | date | Date in format YYYY-MM-DD: the last day of the truth week (Sunday) |
| year_week | string | A string denoting the year and week to which the truth data corresponds |
| value | decimal | COVID19 hospital admissions |

Below are some example rows:

target,location,truth_date,year_week,value
hospital admissions,AT,2023-06-18,2023-W24,1
hospital admissions,AT,2023-06-11,2023-W23,2
hospital admissions,AT,2023-06-04,2023-W22,4
hospital admissions,AT,2023-05-28,2023-W21,3

From the first row, for instance, we can read that in Austria (AT), during week $24$ of the year $2023$, ending on Sunday, June 18, 2023, the reported number of COVID19 hospital admissions was $1$.
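The column definitions can be checked mechanically. The sketch below parses the example rows and verifies that each truth_date is a Sunday and agrees with its year_week. It assumes that year_week follows ISO 8601 week numbering, which is consistent with the example rows but is not stated explicitly on this page; `check_row` is a hypothetical helper.

```python
import csv
import io
from datetime import date

# The example rows from this page, embedded so the sketch is self-contained.
SAMPLE = """target,location,truth_date,year_week,value
hospital admissions,AT,2023-06-18,2023-W24,1
hospital admissions,AT,2023-06-11,2023-W23,2
hospital admissions,AT,2023-06-04,2023-W22,4
hospital admissions,AT,2023-05-28,2023-W21,3
"""


def check_row(row):
    """Return a list of problems found in one row (empty if none)."""
    problems = []
    truth_date = date.fromisoformat(row["truth_date"])
    # date.weekday() counts Monday as 0, so Sunday is 6.
    if truth_date.weekday() != 6:
        problems.append("truth_date is not a Sunday")
    # Assumption: year_week uses ISO 8601 week numbering (YYYY-Www).
    iso = truth_date.isocalendar()
    expected = f"{iso[0]}-W{iso[1]:02d}"
    if row["year_week"] != expected:
        problems.append(f"year_week should be {expected}")
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
issues = {row["truth_date"]: check_row(row) for row in rows}
# The value column is documented as decimal, so parse it as a float.
values = [float(row["value"]) for row in rows]
print(issues)
print(values)
```

To run the same check on a real file, replace `io.StringIO(SAMPLE)` with an open file handle for the downloaded CSV.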

The countries are divided between the two ground truth data sources as follows:

| Data Source | Countries (ISO-2 code) |
| --- | --- |
| ERVISS | AT, BE, HR, CZ, DK, EE, FI, FR, GR, HU, IS, IE, IT, LV, LT, LU, MT, NL, NO, PL, PT, RO, SK, SI |
| FluID | CH, GB-ENG, GB-WLS, GB-NIR, GB-SCT |
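When processing the truth files, it can be handy to look up the data source for a given location code. The following is a minimal sketch that encodes the assignment listed above as a dictionary; the names `SOURCES` and `source_for` are illustrative and do not come from the repository.

```python
# Location codes per data source, copied from the list on this page.
SOURCES = {
    "ERVISS": [
        "AT", "BE", "HR", "CZ", "DK", "EE", "FI", "FR", "GR", "HU", "IS", "IE",
        "IT", "LV", "LT", "LU", "MT", "NL", "NO", "PL", "PT", "RO", "SK", "SI",
    ],
    "FluID": ["CH", "GB-ENG", "GB-WLS", "GB-NIR", "GB-SCT"],
}

# Invert the mapping so each location code points to its source.
LOCATION_TO_SOURCE = {
    location: source
    for source, locations in SOURCES.items()
    for location in locations
}


def source_for(location):
    """Return the data source for a location code, or None if it is unknown."""
    return LOCATION_TO_SOURCE.get(location)


print(source_for("AT"))      # prints "ERVISS"
print(source_for("GB-SCT"))  # prints "FluID"
```

Returning None for unknown codes, rather than raising an error, keeps the lookup usable on files that may contain locations not listed here.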