
Improve performance of emcee #781

Closed
wants to merge 1 commit

Conversation

@peendebak (Contributor) commented Mar 15, 2022

Description

This PR improves the performance of emcee by optimizing the internal _lnprob method.

  • The bounds argument is passed in transposed form. This aligns better (assuming the default NumPy C memory layout) with the comparisons performed against the bounds in _lnprob.
  • The addition of the variable c is moved out of the sum (see the sketch below).
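
A minimal sketch of what these two changes amount to (the function and variable names below are simplified stand-ins chosen for illustration, not the exact lmfit internals; c is assumed to be a scalar):

```python
import numpy as np

# Hypothetical sketch of the two micro-optimizations, not the actual lmfit code.

# Before: bounds has shape (nvars, 2), so each check slices a column of a
# C-ordered array (non-contiguous memory access), and the scalar c is
# broadcast and added element-wise inside the sum.
def lnprob_before(theta, bounds, residual, c):
    if np.any(theta > bounds[:, 1]) or np.any(theta < bounds[:, 0]):
        return -np.inf
    return -0.5 * np.sum(residual**2 + c)

# After: bounds is passed transposed, shape (2, nvars), so the lower and
# upper limits are contiguous rows, and c is added once after the sum
# (np.sum(x + c) == np.sum(x) + x.size * c for scalar c).
def lnprob_after(theta, bounds_T, residual, c):
    if np.any(theta > bounds_T[1]) or np.any(theta < bounds_T[0]):
        return -np.inf
    return -0.5 * (np.sum(residual**2) + residual.size * c)
```

Whether the difference is measurable depends mainly on the number of varying parameters, the number of data points, and how expensive the user-supplied objective function is.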

All changes are internal; the public API is unchanged.

There are more options to improve performance, but these require more extensive changes or changes to the public API.

Type of Changes
  • Bug fix
  • New feature
  • Refactoring / maintenance
  • Documentation / examples
Tested on

Python: 3.9.9 (tags/v3.9.9:ccb0e6a, Nov 15 2021, 18:08:50) [MSC v.1929 64 bit (AMD64)]

lmfit: 1.0.2+29.gcd28293.dirty, scipy: 1.7.0, numpy: 1.21.5, asteval: 0.9.25, uncertainties: 3.1.5

Verification

Have you

  • included docstrings that follow PEP 257? Public API unchanged
  • referenced existing Issue and/or provided relevant link to mailing list? None found
  • verified that existing tests pass locally?
  • verified that the documentation builds locally?
  • squashed/minimized your commits and written descriptive commit messages?
  • added or updated existing tests to cover the changes? Not required
  • updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
  • added an example? Only performance improvements

@newville (Member)

@peendebak Thanks, but also, really? Does that actually give a performance improvement?

Generally speaking, performance is not a primary concern for most of lmfit. And if you're in such an unfortunate situation as to be using emcee, I cannot understand why performance would be a concern. It is designed to be slow, purposefully ignoring known optimizations in order to demonstrate how hard it works.

Also, just to be clear: emcee does not do a fit. I have said this before (#601), but I would rather remove emcee completely from lmfit than pretend that it is useful or worth supporting.

@peendebak (Contributor, Author)

@newville I fully agree with you that the value of emcee does not lie in fitting. We fit our data using one of the other methods in lmfit, and then use emcee for further analysis (so the term fitting in the original description was a bit unfortunate).

When profiling emcee (and other lmfit methods) I noticed that quite some time is spent in the framework itself, and not in the objective function that the user supplies. I understand that the focus for lmfit is on functionality (e.g. the high-level Parameter class) rather than speed. But for this PR the changes are fully internal, so I don't see any reason not to use them.
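
For reference, this kind of profile can be reproduced with the standard library profiler; the toy model and numbers below are made up purely for illustration, not the actual data or objective function:

```python
import cProfile
import numpy as np
import lmfit

# Toy residual function; the real objective function is application-specific.
def residual(params, x, data):
    a = params['a'].value
    tau = params['tau'].value
    return a * np.exp(-x / tau) - data

x = np.linspace(0, 10, 201)
data = 2.5 * np.exp(-x / 1.3) + np.random.normal(scale=0.02, size=x.size)

params = lmfit.Parameters()
params.add('a', value=2.0, min=0)
params.add('tau', value=1.0, min=0.1)

# Profile an emcee run (requires the emcee package) to compare the time spent
# inside lmfit itself with the time spent in the user-supplied residual.
cProfile.runctx(
    "lmfit.minimize(residual, params, args=(x, data), method='emcee', "
    "steps=200, burn=20)",
    globals(), locals(), sort='cumtime')
```

Sorting by cumulative time makes it easy to compare the time spent in _lnprob and the surrounding machinery with the time spent in the residual function itself.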

@newville (Member)

@peendebak this PR is a proposed change to the code with the explicit intent of improving performance. But you have not given any indication that performance is improved. Why not?

Making any changes in the use of emcee, especially saying that the change "improves performance/features", suggests that we support and endorse the use of emcee within lmfit.minimize. I think this code is misleading, not very useful, and should be removed.

@reneeotten (Contributor)

@peendebak @newville I agree with Matt that some benchmarking of how much faster this is would indeed be useful. I find it hard to see why changing the order of the bounds would make a significant difference.

Having said that, whenever someone submits a PR or question/issue regarding emcee, Matt's first response is "let's just remove it", as it isn't intended for fitting and is often misused for that. I do think it's a useful method, but indeed not for an initial fit, as we all seem to agree. We talked in the past about removing emcee as a fitting method and moving it to a dedicated space for exploring the parameter space after a successful fit - I would be very much in favor of that but just haven't been able to find the time to do it and, additionally, I am not really an expert in emcee...

So how about, instead of merging this PR, we actually do that work now, as we might have an interested party, @peendebak, in contributing to that goal? It would also allow us to change the API much more, since we don't need to be backwards compatible (we could leave emcee as a method to minimize for one other release) and try to solve some other open issues as well. For example, I think you might gain more from actually fixing the issue that emcee doesn't seem to use multiprocessing - I did fiddle around with that a bit in the past and indeed verified that it isn't working, but couldn't figure out how to solve it.

So if you guys agree, I will convert #601 from an Issue to a Discussion and we can continue there to decide how to organize this change; once that is settled, writing the actual code should probably not be too hard. Thoughts?

@newville (Member)

@reneeotten I agree with all of that, including a further discussion and trying to make a plan to move use of emcee to be more clearly "post fit" analysis, similar to conf_interval().

@peendebak (Contributor, Author)

@newville @reneeotten I will close the PR, as there is no support for emcee in its current form as a "fitting" algorithm.

I would be happy to use or review emcee functionality in a different form, but will not have the resources to drive these new changes.

@peendebak closed this Mar 17, 2022