Bias correction of the MLE estimates of prevalence #7

Open
AngusMcLure opened this issue Jun 3, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

AngusMcLure (Owner) commented Jun 3, 2021

The MLE is known to be biased for the pooled testing problem. Using Firth's correction seems to be a good way to correct this. The details of how to implement this for estimating a single prevalence (e.g. for use in PoolPrev) have been worked out by Hepworth and Biggerstaff. I need to double-check, but this should be equivalent to calculating the posterior mode with the Jeffreys prior (which is already implemented for the Bayesian analysis).
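That equivalence is easy to check numerically for the one-sample case. The following is a rough sketch in R (not PoolTestR code), assuming N pools of equal size s with x testing positive: it compares the closed-form MLE of prevalence with the posterior mode under the Jeffreys prior, which for a single parameter is the same as maximising the Firth-penalised log-likelihood.

```r
# Sketch only: compare the plain MLE of prevalence from pooled tests with the
# posterior mode under the Jeffreys prior.
# Model: each of N pools of size s is positive with probability
#   theta(p) = 1 - (1 - p)^s, and x pools test positive.

loglik <- function(p, x, N, s) {
  theta <- 1 - (1 - p)^s
  x * log(theta) + (N - x) * log(1 - theta)
}

# Log Jeffreys prior = 0.5 * log of the per-pool Fisher information
#   I(p) = s^2 * (1 - p)^(2 * (s - 1)) / (theta * (1 - theta))
log_jeffreys <- function(p, s) {
  theta <- 1 - (1 - p)^s
  0.5 * (2 * log(s) + 2 * (s - 1) * log(1 - p) - log(theta) - log(1 - theta))
}

x <- 3; N <- 50; s <- 10

# Plain MLE (closed form when all pools are the same size)
mle <- 1 - (1 - x / N)^(1 / s)

# Posterior mode under the Jeffreys prior (= Firth-penalised maximum)
post_mode <- optimize(
  function(p) loglik(p, x, N, s) + log_jeffreys(p, s),
  interval = c(1e-8, 1 - 1e-8),
  maximum  = TRUE
)$maximum

c(MLE = mle, Jeffreys_mode = post_mode)
```

For a scalar parameter the Firth penalty is exactly 0.5 * log I(p), so the penalised maximum and the Jeffreys posterior mode coincide; comparing either to the plain MLE shows how large the correction is for a given dataset.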

Bias corrected estimates might also be desirable for the regression models. There is a discussion of bias here. One way to do this (at least for 'fixed effect' models) would be to use the brglm2 package. It seems that it might be able to handle custom link functions. Based on the notes section of this vignette and this vignette, it may just require some additional information to be appended to the link function (higher derivatives of the log-likelihood function). There is also some published work on Firth correction in group-testing regression here.
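As a stopgap that avoids custom links entirely: if the individual-level prevalence is modelled on the complementary log-log scale, the pool-level positivity probability is exactly 1 - exp(-exp(eta + log(pool size))), so pool-level results can be fitted as an ordinary binomial GLM with a cloglog link and an offset of log(pool size), and brglm2's fitting method can be dropped in for the bias reduction. The sketch below assumes a hypothetical data frame pool_data with columns Positive (0/1 pool result), PoolSize, and a covariate x.

```r
# Sketch only: Firth-type (mean bias-reducing) fit of a pool-level regression
# using brglm2, with the cloglog + offset(log(PoolSize)) trick so that no
# custom link is needed.
library(brglm2)

fit_br <- glm(
  Positive ~ x + offset(log(PoolSize)),  # pool result ~ covariates + pool-size offset
  family = binomial(link = "cloglog"),
  data   = pool_data,
  method = "brglmFit",  # brglm2's adjusted-score fitting method
  type   = "AS_mean"    # mean bias reduction, i.e. the Firth-type adjustment
)
summary(fit_br)
```

The trade-off is that the cloglog-plus-offset model puts covariate effects on a log-hazard-like scale rather than the log-odds scale, so it is a workaround rather than a drop-in replacement for a logit-link pooled regression; the latter would still need the extra link derivatives described in the brglm2 vignettes.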

Extending this to mixed effects models might be more difficult...

Bias correction doesn't fit as nicely with the Bayesian paradigm. However, the simulation study in the Env Mod Soft paper did show that the posterior mean prevalence estimates were upwardly 'biased' by approx 5% when using the default priors (i.e. across a lot of simulated datasets where the true prevalence was 1%, the mean of the predicted posterior means was about 1.05%). The 'bias' got a little worse at lower prevalence. This suggests that some sort of correction might be sensible, but an informative prior on the intercept (the log odds of infection in the reference category) would be better where the information to construct one is available. Also, the same simulation study showed that the 95% credible intervals 'covered' the true value about 95% of the time (again with non-informative priors), which suggests that the bias is not such a big deal.

AngusMcLure added the enhancement (New feature or request) label Jun 3, 2021
AngusMcLure (Owner, Author) commented:
Note that this other package does Firth bias correction (though just for one-sample prevalence estimates and two-sample differences in prevalence): https://github.com/CDCgov/PooledInfRate
