Major inference rewrite #90

Closed
jcblemai opened this issue Oct 26, 2023 · 2 comments
Labels
enhancement Request for improvement or addition of new feature(s). medium priority Medium priority. r-inference Relating to the R inference package.

Comments

@jcblemai
Collaborator

jcblemai commented Oct 26, 2023

Many of the current issues concern inference (#87, #86, #84, #85, ...).

At the risk of delaying fixes for those issues, I wanted to start a discussion about rewriting inference on top of the current gempyor object structure.

Major benefits:

  • One-click installation once you have Python installed: much easier setup, and it opens the door to an easy-to-use GUI on Windows, a .exe installer, and so on. Currently our team spends a lot of time trying to install flepiMoP or to run it on the server because of the Python/R co-dependence. This massively hinders our distribution.
  • Simpler architecture and maintenance: currently each basic functionality is written once in Python and again in multiple R files, which makes it very hard to change anything in inference (see the current unfinished attempts to fit initial conditions and to fit without NPI, and the failing test). Inference in Python would use the unified filesystem layer already defined in gempyor (gempyor objects do not write parquet themselves; they hand what they want to write to a function that handles extension, subpop, and filename in all cases). It would be easier to change our baseline implementation.
  • Building a better inference architecture: this requires some thought and is not exclusive to a rewrite in Python, but we could take inspiration from the architecture of standard inference packages and interact with the existing Python ecosystem. We could get diagnostic plots and criteria from ArviZ, and sampler perturbation tuning from emcee, which makes sampling more efficient in high-dimensional spaces. We would also benefit from having parameters as an object, for plotting and for the command line.
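
To make the "one unified layer to the filesystem" point concrete, here is a minimal sketch of the pattern, not gempyor's actual API. Every name in it (`write_output`, `WRITERS`, the `kind` and `subpop` arguments, the path layout) is hypothetical, and CSV/JSON stand in for parquet so the example runs with the standard library only.

```python
import csv
import json
from pathlib import Path

# Hypothetical names throughout: this illustrates the pattern only,
# it is NOT gempyor's real interface or file layout.

def _write_csv(path, rows):
    # rows is a non-empty list of dicts sharing the same keys
    with path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

def _write_json(path, rows):
    path.write_text(json.dumps(rows))

# One registry maps an extension to the function that knows how to write it.
WRITERS = {"csv": _write_csv, "json": _write_json}

def write_output(root, kind, subpop, rows, extension="csv"):
    """Single entry point: resolve the path, pick the writer, write the rows."""
    if extension not in WRITERS:
        raise ValueError(f"unsupported extension: {extension}")
    path = Path(root) / kind / f"{subpop}.{kind}.{extension}"
    path.parent.mkdir(parents=True, exist_ok=True)
    WRITERS[extension](path, rows)
    return path
```

With this shape, an inference step would call `write_output(root, "seir", "subpop_a", rows)` and never build a filename or choose a file format itself, so changing the layout or adding a format touches one function.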

Minor benefits

  • Performance: good integration with gempyor allows rerunning just some parts of the model, selecting only the outcomes needed for the fit, and not writing intermediate simulations to disk (a big overhead on Docker or on scratch4).
  • Python HPC ecosystem: for cloud or local compute, Python has some great high-performance computing and big-data libraries we could use. It's minor because R has a lot of strong points as well.
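
As a rough illustration of the performance point, the sketch below keeps the whole fitting loop in memory and asks the model only for the outcome being fitted. It is a toy under stated assumptions: `simulate`, the outcome names `incidH` and `incidD`, the Gaussian likelihood, and the plain random-walk Metropolis sampler are all made up for the example and are not how flepiMoP inference is implemented.

```python
import math
import random

def simulate(beta, outcomes=("incidH",)):
    """Toy stand-in for a model run (hypothetical); returns only requested outcomes."""
    all_outcomes = {
        "incidH": [beta * t for t in range(5)],
        "incidD": [0.1 * beta * t for t in range(5)],
    }
    return {name: all_outcomes[name] for name in outcomes}

def log_likelihood(simulated, observed, sigma=1.0):
    # Gaussian log-likelihood up to an additive constant
    return -sum((s - o) ** 2 for s, o in zip(simulated, observed)) / (2 * sigma**2)

def fit(observed, n_iter=2000, step=0.2, seed=0):
    """Random-walk Metropolis on one parameter; nothing is written to disk."""
    rng = random.Random(seed)
    beta = 1.0
    ll = log_likelihood(simulate(beta)["incidH"], observed)
    chain = []
    for _ in range(n_iter):
        proposal = beta + rng.gauss(0.0, step)
        ll_new = log_likelihood(simulate(proposal)["incidH"], observed)
        # Accept with probability min(1, exp(ll_new - ll))
        if ll_new >= ll or rng.random() < math.exp(ll_new - ll):
            beta, ll = proposal, ll_new
        chain.append(beta)
    return chain
```

Each iteration passes simulated values straight to the likelihood, so the only thing that ever needs to reach the filesystem is the final chain.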

Drawbacks

  • It's quite some work. I'd estimate 2 weeks for feature parity.
  • Part of our team is fluent in R, and this would reduce their ability to change inference (but inference was also changed by non-R folks recently, and we now have more Python devs: Koji, Pengcheng).
@jcblemai jcblemai added enhancement Request for improvement or addition of new feature(s). medium priority Medium priority. r-inference Relating to the R inference package. labels Oct 26, 2023
@alsnhll
Collaborator

alsnhll commented Oct 30, 2023

Happy to help with this

@TimothyWillard
Contributor

@jcblemai @alsnhll I think this can be closed as resolved by GH-203. Further improvements can be described in new targeted issues.
