-
Any questions/thoughts on this?
-
Do we need to do anything here besides warn users these two asymptotic variance estimates are inequivalent and may lead to different estimates when overlap is poor? If we want to help users apply MBAR more robustly, I think our focus is elsewhere:
-
I agree this is a low-priority thing; I am resurrecting my edits from this summer, and wanted to get confirmation before making a decision by fiat. I can document the difference in the code for now and leave it at that.
Point 1 above is straightforward; perhaps file an issue on the behavior you'd like to see? Point 2 is also straightforward; it's more a matter of deciding what the appropriate cutoffs should be and why. Point 3 is a research topic we've spent a lot of time on: the normalization constant itself is not amenable to Bayesian inference, since it is parameterless (there is only one set of normalization constants that satisfies the equations and data). One proposal (that I am not sure has been done entirely correctly) introduces parameters in the density of states \sigma(E), but that gets a bit away from the pure probability interpretation, would have to be thought about a fair amount before deciding whether it really provides additional information, and would complicate things immensely if we had to estimate parameters for many \sigma(E) when the energy function changes at each step.
Providing a tool for automatic bootstrapping might be better, reporting errors as confidence intervals instead. Other non-point estimates of the uncertainty might be applicable. Right now, the one we are using is essentially the implicit-function version of \delta f(x) = |df/dx| \delta x. In some sense, bootstrapping would be the "right" approach, since it describes the uncertainty in the empirical distribution, which is what we are using to make the estimates.
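To make the bootstrapping suggestion concrete, here is a minimal sketch of a percentile-bootstrap confidence interval for a pairwise free energy difference. This is illustrative only: estimate_free_energy is a hypothetical placeholder for whatever estimator gets plugged in (e.g., a BAR solve), not an existing pymbar function.

```python
# Minimal sketch: percentile-bootstrap confidence interval for a pairwise free energy
# difference. `estimate_free_energy(w_F, w_R)` is a hypothetical placeholder for the
# estimator of interest; it is not an existing pymbar function.
import numpy

def bootstrap_ci(w_F, w_R, estimate_free_energy, n_boot=1000, alpha=0.05, seed=0):
    rng = numpy.random.default_rng(seed)
    estimates = numpy.empty(n_boot)
    for b in range(n_boot):
        # Resample the work values with replacement, separately within each state.
        w_F_b = rng.choice(w_F, size=len(w_F), replace=True)
        w_R_b = rng.choice(w_R, size=len(w_R), replace=True)
        estimates[b] = estimate_free_energy(w_F_b, w_R_b)
    # Report a (1 - alpha) confidence interval from the bootstrap distribution.
    lower, upper = numpy.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper
```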
-
I like the bootstrapping idea. Do you know if alchemlyb uses our implementation of MBAR? If so, it might be good to coordinate with them as well. And with the thermotools devs from Frank Noé's lab that provide MBAR and TRAM implementations. I guess it would be useful to figure out what the most important services we can provide are in planning any future changes. Maybe this means we should also instrument the code to identify which features people use and what could use optimizing by having the code phone home by default. These ideas are all out of scope for this issue, but how about we plan a brainstorming session (maybe in person?) where we can assess possibilities and effort commitments and plan the best way forward? We might be able to apply some funding toward this project.
-
Sounds good. Should be pretty easy to implement.
Yes, it does. I am working with them to port over alchemical_analysis functionality to alchemlyb, so I can manage this. We might think about what functionality should go there, but computing an uncertainty in multiple ways seems like a core functionality.
I'll let you manage that coordination.
Yes.
Ugh. Could bring up a number of issues. Perhaps put that lower on the list . . .
Brainstorming is good. I would love to finally get the 'Everything you wanted to know about MBAR but were afraid to ask' review article out (likely as a LiveCOMS paper) this semester, so an improved version of pymbar (drawing on what is in #255) would go along well with that. I can devote some time this semester.
-
Addressed by #286
-
Will leave this here for now as a reminder that we need to figure out why the difference exists. What two approaches are being used, and why are they different?
-
See #430 for how this shows up when trying to figure out what the overlap matrix is, since overlap is directly related to uncertainty.
-
Hi, all-
I've identified an interesting difference between BAR and MBAR uncertainty estimates as currently implemented in the code. If you start directly from Bennett's paper, plugging equation (8) into (7), you eventually get:
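A sketch of what I believe that expression is, assuming the standard form of Bennett's asymptotic error estimate and using the notation defined in the next sentence (the assignment of f(-x) to the state-0 average and f(x) to the state-1 average follows from the definition of x below):

\sigma_A^2 \approx (1/n_0) [ <f(-x)^2>_0 / <f(-x)>_0^2 - 1 ] + (1/n_1) [ <f(x)^2>_1 / <f(x)>_1^2 - 1 ]

where n_0 and n_1 are the numbers of samples drawn from states 0 and 1.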
Where "_i" indicates averages over state 0 or 1, f(x) is the Fermi function, and x = M+(F1-F0)-(U1-U0), with M = ln n1/n0. This formula has been rederived a couple of times in different places, and is what is generally used for BAR -- and is what is used in the code currently. Call this method A.
If, however, you take my PRL BAR paper, you get an estimator for the variance that can be reduced to:
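A sketch of what I believe that reduces to, written as the Fisher-information variance of the maximum-likelihood estimate; the sum runs over all n_0 + n_1 samples, with x evaluated at each sample using the definition above:

\sigma_B^2 \approx [ \sum_i f(x_i) f(-x_i) ]^{-1} - 1/n_0 - 1/n_1 = [ \sum_i 1/(2 + 2\cosh(x_i)) ]^{-1} - 1/n_0 - 1/n_1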
Call this method B. This approach is also exactly what MBAR gives you (as proven in the MBAR paper); it is the Fisher-information-derived uncertainty obtained from the likelihood.
A and B are not actually the same thing, though in the limit of good sampling, they are both practically indistinguishable from the true variance (obtained by repeating a bunch of times). I believe they are both "correct" in that they are both asymptotic estimators of the uncertainty, they just have somewhat different asymptotic behavior.
Which one is better?
Method A turns out to be a good estimate even if your free energy is wrong, which method B is not. So I was using method A, since I was using MBAR for other BAR-based estimators (UBAR and RBAR in the Paliwal et al. 2011 paper). Recall that Bennett showed that his approach produced an estimator of the free energy even if you didn't use the right constant, but that the variance was minimized with the right constant (which was the free energy + log ratio of number of samples). So A works regardless of whether the constant is right.
Method B appears to be slightly closer to the right answer for moderate to small overlap, but the difference is very small.
Method B is wrong if the constant is wrong (IIRC - this is something I recall from a while ago, which is why I bothered to change it from the original method B to this one; maybe there was another error that was really the problem).
Method A is somewhat of an underestimate for very poor overlap and low numbers of samples.
Method B is an overestimate (sometimes wildly) for very poor overlap and low numbers of samples.
In situations where they differ noticeably from each other, they are also pretty different from the true answer (obtained by calculating the standard deviation over lots of realizations).
Examples of the divergence:
Test case, offset harmonic oscillators:
import numpy

O_k = numpy.array([0.0, 1.0])    # offsets (equilibrium positions) for each oscillator
N_k = 1 * numpy.array([10, 10])  # number of samples from each state
beta = 1.0                       # inverse temperature for all simulations
K_k = numpy.array([1.0, 1.0])    # spring constant for each state (placeholder values; these were varied)
Repeat 1000 times with new random data to generate the true standard deviation. (Repeating the whole procedure changes the answers slightly.)
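For anyone who wants to reproduce this kind of comparison, here is a self-contained sketch (my own illustration, not the pymbar implementation, and using the method A and method B formulas as I have written them above) that samples the two oscillators, solves the BAR self-consistency equation, and evaluates both uncertainty estimates alongside the empirical standard deviation over 1000 repeats:

```python
# Sketch: compare the two asymptotic uncertainty estimates (methods A and B, as written
# above) with the empirical standard deviation over many independent realizations.
import numpy
from scipy.optimize import brentq

O_k = numpy.array([0.0, 1.0])    # offsets for each oscillator
K_k = numpy.array([1.0, 1.0])    # spring constants (vary these to change overlap)
N_k = numpy.array([10, 10])      # number of samples from each state
beta = 1.0                       # inverse temperature

def fermi(t):
    return 1.0 / (1.0 + numpy.exp(t))

def u(k, x):
    # Reduced potential of harmonic state k.
    return beta * 0.5 * K_k[k] * (x - O_k[k]) ** 2

def one_realization(rng):
    # Exact (independent) sampling from each harmonic oscillator.
    x0 = rng.normal(O_k[0], 1.0 / numpy.sqrt(beta * K_k[0]), N_k[0])
    x1 = rng.normal(O_k[1], 1.0 / numpy.sqrt(beta * K_k[1]), N_k[1])
    w_F = u(1, x0) - u(0, x0)    # forward work (samples from state 0)
    w_R = u(0, x1) - u(1, x1)    # reverse work (samples from state 1)
    n0, n1 = N_k
    M = numpy.log(n1 / n0)       # M = ln n1/n0, as defined above

    # BAR self-consistency: sum_0 f(-x) = sum_1 f(x), with x = M + DeltaF - (U1 - U0).
    def g(DeltaF):
        return numpy.sum(fermi(-(M + DeltaF - w_F))) - numpy.sum(fermi(M + DeltaF + w_R))

    DeltaF = brentq(g, -100.0, 100.0)
    x_F = M + DeltaF - w_F       # x evaluated at state-0 samples
    x_R = M + DeltaF + w_R       # x evaluated at state-1 samples

    # Method A: Bennett-style error propagation.
    fF, fR = fermi(-x_F), fermi(x_R)
    var_A = numpy.var(fF) / (n0 * numpy.mean(fF) ** 2) + numpy.var(fR) / (n1 * numpy.mean(fR) ** 2)

    # Method B: Fisher-information (MBAR-equivalent) estimate.
    x_all = numpy.concatenate([x_F, x_R])
    var_B = 1.0 / numpy.sum(fermi(x_all) * fermi(-x_all)) - 1.0 / n0 - 1.0 / n1

    return DeltaF, numpy.sqrt(var_A), numpy.sqrt(var_B)

rng = numpy.random.default_rng(0)
results = numpy.array([one_realization(rng) for _ in range(1000)])
print("empirical std of DeltaF :", results[:, 0].std())
print("mean method A estimate  :", results[:, 1].mean())
print("mean method B estimate  :", results[:, 2].mean())
```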
SO:
What to do?
In most cases, the results are pretty much indistinguishable, so operationally, not a big deal for any real application. I (or someone else? I couldn't find the issue discussing it) noticed a difference, and I investigated.
I have not yet been able to identify exactly where the differences come from in the derivations, but it is likely something subtle. One is standard error propagation and one is the Fisher information approach, and it's not yet clear to me why they are different.
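One way to frame the difference (my framing, a rough sketch rather than a derivation): method A propagates the sampling variance of the Fermi-function averages through the implicit BAR equation, essentially

\sigma_A^2 \approx var(g) / (dg/d\Delta F)^2, where g(\Delta F) is the estimating function being zeroed,

while method B inverts the curvature (the observed Fisher information) of the log-likelihood at its maximum,

\sigma_B^2 = [ -d^2 \ln L / d\Delta F^2 ]^{-1}.

Both are leading-order asymptotic approximations, so they converge to the same value as n_0 and n_1 grow, but nothing forces them to agree at finite sample size or poor overlap.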