
How many case studies are needed? #9

Open
seabbs opened this issue Jul 14, 2021 · 4 comments

@seabbs
Contributor

seabbs commented Jul 14, 2021

Having a single case study (i.e. Germany) greatly simplifies the work but may make the results less generalizable. Having two (i.e. Germany and the UK) may improve matters. More may be better but obviously dramatically increases the analysis degrees of freedom.

@jbracher
Collaborator

In which sense does the addition of countries add analysis degrees of freedom? Because variant data for each country need to be pre-processed individually?

@seabbs
Contributor Author

seabbs commented Sep 8, 2021

In the sense that we have more to summarise and draw inferences from, plus the compute burden of course. Currently thinking of using your data for Germany and then scraping data for three other countries with relatively late introductions (as earlier data don't seem to be available with a permissive license).
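As a rough sketch (purely illustrative, not the analysis code) of how "relatively late introductions" could be flagged: for each country, take the first date on which the variant's share of sequenced samples exceeds a small threshold. The data frame and column names (location, date, share_voc) are assumptions, not the actual data format.

```r
library(dplyr)

# For each country, the first date the variant share crosses `threshold`;
# countries are sorted so the latest introductions come first.
introduction_dates <- function(seq_data, threshold = 0.01) {
  seq_data |>
    filter(share_voc >= threshold) |>
    group_by(location) |>
    summarise(introduced = min(date), .groups = "drop") |>
    arrange(desc(introduced))
}
```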

@seabbs seabbs transferred this issue from epiforecasts/forecast.vocs Sep 8, 2021
@seabbs
Contributor Author

seabbs commented Sep 15, 2021

Currently, we have data covering quite a few countries included in the ECDC hub. See:

We could either try to use all of them (or at least those with semi-complete data), though this would potentially be quite a heavy computational burden, or choose a subset.

If choosing a subset, then which? Personally, I am interested in the UK, but sadly its historical data coverage is poor, as the covariants data only start at the end of June (report date, not sample-taken date). Otherwise we could take a representative sample or choose based on other criteria such as interest.

Whilst developing the analysis, we will likely pick a small subset just for speed of computation etc., but can expand down the line as needed.

Other options:

  • Use a subset of countries that have good historical data for the main retrospective analysis and another set (e.g. including the UK) for a data availability sensitivity analysis (a rough selection sketch follows below).
  • For countries with high Delta burdens early on, we could use expert-derived reporting schedules (i.e. we guess) to approximate when we think data were available. Not that keen on this option.
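A rough sketch of the "good historical data" subset idea, assuming weekly sequence data with columns location, date, and seq_total (all placeholder names, and the coverage threshold is illustrative only):

```r
library(dplyr)

# Keep countries whose sequence data are reported for at least `min_coverage`
# of the weeks in the study window.
complete_countries <- function(seq_data, start, end, min_coverage = 0.9) {
  weeks <- seq(as.Date(start), as.Date(end), by = "week")
  seq_data |>
    filter(date >= as.Date(start), date <= as.Date(end), seq_total > 0) |>
    group_by(location) |>
    summarise(coverage = n_distinct(date) / length(weeks), .groups = "drop") |>
    filter(coverage >= min_coverage) |>
    pull(location)
}
```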

@seabbs
Contributor Author

seabbs commented Sep 17, 2021

Added four countries for now, with the plan to consider all available countries with a sufficient case burden to make forecasting sensible.
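For reference, a minimal sketch of the case burden screen mentioned above, assuming a weekly case data frame with columns location and cases; the threshold is a placeholder, not a decided value.

```r
library(dplyr)

# Countries whose median weekly case count clears an (arbitrary) threshold.
sufficient_burden <- function(cases_data, min_median_weekly = 1000) {
  cases_data |>
    group_by(location) |>
    summarise(median_weekly = median(cases, na.rm = TRUE), .groups = "drop") |>
    filter(median_weekly >= min_median_weekly) |>
    pull(location)
}
```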
