
Unable to use a custom prior that contains a pm.Deterministic #857

Open
ivanistheone opened this issue Nov 21, 2024 · 5 comments

@ivanistheone

Hello! I'm trying to reproduce the model from the Bayesian Estimation Supersedes the t-Test paper in Bambi.

I would like to set a "shifted exponential" prior for the degrees-of-freedom parameter, $\nu \sim \text{Expon}\left(\lambda=\frac{1}{29}\right) + 1$, which is what we would get if pm.Exponential accepted a loc=1 option like scipy.stats.expon.
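For reference, here is the target distribution expressed with scipy's loc/scale parametrization (scale = 1/λ = 29); this snippet is only an illustration:

from scipy import stats

# The prior I'm after: Expon(lam=1/29) shifted to start at 1.
# scipy parametrizes the exponential with loc/scale, where scale = 1/lam.
nu_prior = stats.expon(loc=1, scale=29)
print(nu_prior.support())  # (1.0, inf)
print(nu_prior.mean())     # loc + scale = 30.0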

I tried implementing the shifted exponential using a custom prior:

def ShiftedExponential(name, lam, *args, dims=None, **kwargs):
    exp = pm.Exponential.dist(lam=lam)
    return pm.Deterministic(name, exp+1, *args, dims=dims, **kwargs)

priors = {
    ...
    "nu": bmb.Prior("ShiftedExponential", lam=1/29, dist=ShiftedExponential),
}

The model builds and graphs fine, but when I run model.fit() I get the following error:
ValueError: Random variables detected in the logp graph: {MeasurableAdd.0, exponential_rv{"()->()"}.out}.
This can happen when DensityDist logp or Interval transform functions reference nonlocal variables,
or when not all rvs have a corresponding value variable.

with this traceback:

File pymc/model/core.py:672, in Model.logp(self, vars, jacobian, sum)
    670 rv_logps: list[TensorVariable] = []
    671 if rvs:
--> 672     rv_logps = transformed_conditional_logp(
    673         rvs=rvs,
    674         rvs_to_values=self.rvs_to_values,
    675         rvs_to_transforms=self.rvs_to_transforms,
    676         jacobian=jacobian,
    677     )
    678     assert isinstance(rv_logps, list)
    680 # Replace random variables by their value variables in potential terms

File pymc/logprob/basic.py:614, in transformed_conditional_logp(rvs, rvs_to_values, rvs_to_transforms, jacobian, **kwargs)
    612 rvs_in_logp_expressions = _find_unallowed_rvs_in_graph(logp_terms_list)
    613 if rvs_in_logp_expressions:
--> 614     raise ValueError(RVS_IN_JOINT_LOGP_GRAPH_MSG % rvs_in_logp_expressions)
    616 return logp_terms_list
ValueError: Random variables detected in the logp graph: {MeasurableAdd.0, exponential_rv{"()->()"}.out}.
This can happen when DensityDist logp or Interval transform functions reference nonlocal variables,
or when not all rvs have a corresponding value variable.
Complete code example below the fold

# DATA
import pandas as pd
treated = [101, 100, 102, 104, 102,  97, 105, 105,  98, 101, 100, 123, 105,
           103, 100,  95, 102, 106, 109, 102,  82, 102, 100, 102, 102, 101,
           102, 102, 103, 103,  97,  97, 103, 101,  97, 104,  96, 103, 124,
           101, 101, 100, 101, 101, 104, 100, 101]
controls = [ 99, 101, 100, 101, 102, 100,  97, 101, 104, 101, 102, 102, 100,
            105,  88, 101, 100, 104, 100, 100, 100, 101, 102, 103,  97, 101,
            101, 100, 101,  99, 101, 100, 100, 101, 100,  99, 101, 100, 102,
            99, 100,  99]
groups = ["treatment"]*len(treated) + ["control"]*len(controls)
iqs = pd.DataFrame({"group": groups, "iq": treated + controls})


# MODEL
import bambi as bmb
import numpy as np
import pymc as pm

iqs_mean, iqs_std = iqs["iq"].mean(), iqs["iq"].std()

formula = bmb.Formula("iq ~ 0 + group", "sigma ~ 0 + group")

def ShiftedExponential(name, lam, *args, dims=None, **kwargs):
    exp = pm.Exponential.dist(lam=lam)
    return pm.Deterministic(name, exp+1, *args, dims=dims, **kwargs)

priors = {
    "group": bmb.Prior("Normal", mu=iqs_mean, sigma=1000*iqs_std),
    "sigma": {"group": bmb.Prior("Uniform", lower=np.log(iqs_std/1000), upper=np.log(iqs_std*1000))},
    "nu": bmb.Prior("ShiftedExponential", lam=1/29, dist=ShiftedExponential),
}

# Build model
model = bmb.Model(formula=formula,
                  family="t",
                  link="identity",
                  priors=priors,
                  data=iqs)

# Get model description
print(model)
#      Formula: iq ~ 0 + group
#               sigma ~ 0 + group
#       Family: t
#         Link: mu = identity
#               sigma = log
# Observations: 89
#       Priors: 
#   target = mu
#       Common-level effects
#           group ~ Normal(mu: 101.1798, sigma: 4744.7622)
#       Auxiliary parameters
#           nu ~ ShiftedExponential(lam: 0.0345)
#   target = sigma
#       Common-level effects
#           sigma_group ~ Uniform(lower: -5.3507, upper: 8.4648)

idata = model.fit()

Workaround using pm.Truncated

In this specific case, I found a workaround using pm.Truncated: since the exponential distribution is memoryless, truncating $\text{Expon}\left(\lambda=\frac{1}{29}\right)$ at 1 gives the same PDF as $\text{Expon}\left(\lambda=\frac{1}{29}\right) + 1$, as shown below:

def TruncatedExponential(name, lam, *args, dims=None, **kwargs):
    exp = pm.Exponential.dist(lam=lam)
    return pm.Truncated(name, exp, lower=1, *args, dims=dims, **kwargs)

priorst = {
    "group": bmb.Prior("Normal", mu=iqs_mean, sigma=1000*iqs_std),
    "sigma": {"group": bmb.Prior("Uniform", lower=np.log(iqs_std/1000), upper=np.log(iqs_std*1000))},
    "nu": bmb.Prior("TruncatedExponential", lam=1/29, dist=TruncatedExponential),
}

modelt = bmb.Model(formula=formula,
                  family="t",
                  link="identity",
                  priors=priorst,
                  data=iqs)

# Works fine and reproduces the results from the paper
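As a quick numerical check (illustrative, using scipy) that truncating at 1 and shifting by 1 give the same density:

import numpy as np
from scipy import stats

# Memorylessness check: Expon(lam) truncated to [1, inf) has the
# same PDF as Expon(lam) + 1.
lam = 1/29
x = np.linspace(1, 150, 500)
shifted_pdf = stats.expon(loc=1, scale=1/lam).pdf(x)
# Truncated PDF: renormalize by the probability mass above 1
truncated_pdf = stats.expon(scale=1/lam).pdf(x) / stats.expon(scale=1/lam).sf(1)
assert np.allclose(shifted_pdf, truncated_pdf)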

I still wanted to flag that pm.Deterministic didn't work (or, more likely, that I'm not using it right).

See this notebook for complete code:
https://github.com/minireference/noBSstats/blob/main/notebooks/explorations/bambi_BEST_Deterministic.ipynb

numpy : 1.26.4
bambi : 0.14.1.dev12+g3a30784
pymc  : 5.18.0
pandas: 2.2.2
arviz : 0.19.0
@GStechschulte (Collaborator)

Hey @ivanistheone, thanks for the detailed issue. I think this is the expected behavior: deterministic variables in probabilistic programming represent values that are completely determined by their input variables.

A deterministic prior cannot be sampled directly, because deterministic variables are computed from their parent variables rather than sampled like ordinary random variables.
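For example, the usual pattern (a minimal sketch) is to attach pm.Deterministic to an actual random variable, so it gets recorded in the trace but is never sampled itself:

import pymc as pm

with pm.Model():
    # nu_raw is a proper random variable with a logp, so it can be sampled
    nu_raw = pm.Exponential("nu_raw", lam=1/29)
    # nu is a pure function of nu_raw: it is tracked in the trace,
    # but contributes nothing to the model's log-probability
    nu = pm.Deterministic("nu", nu_raw + 1)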

@tomicapretto correct me if I am wrong here.

@ivanistheone (Author)

Right. From reading other PyMC code examples, I see pm.Deterministic always used for extra computed variables, but never to specify priors.

Is there another way to produce a custom prior of the form $\nu \sim \text{Expon}\left(\lambda\right) + 1$ ?

Here is another thing I tried based on pm.CustomDist:

def ShiftedExpCustomDist(name, lam, **kwargs):
    def shifted_exp(lam, loc, size):
        return pm.Exponential.dist(lam=lam, size=size) + loc
    return pm.CustomDist(name, lam, 1, dist=shifted_exp, **kwargs)

priorscd = {
    ...
    "nu": bmb.Prior("ShiftedExpCustomDist", lam=1/29, dist=ShiftedExpCustomDist),
}

which seems to work, though it produced a lot of divergent transitions, so it's probably not the right way to do this.
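One way I sanity-checked the prior (a sketch; pm.CustomDist.dist builds the distribution without registering it in a model, so it can be drawn from directly):

import numpy as np
import pymc as pm

def shifted_exp(lam, loc, size):
    return pm.Exponential.dist(lam=lam, size=size) + loc

# Draw prior samples directly, outside any model context
nu = pm.CustomDist.dist(1/29, 1, dist=shifted_exp)
samples = pm.draw(nu, draws=10_000)
print(samples.min())  # should be >= 1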

@GStechschulte (Collaborator) commented Nov 25, 2024

Yeah, this code follows much more closely what the PyMC docs recommend. See the third example in the pm.CustomDist documentation for a shifted exponential distribution.

If you pass -1 to shift, it looks like the divergences go away and you get the same results as the paper. Here's the code I was using:

import bambi as bmb
import numpy as np
import pymc as pm
from pytensor.tensor import TensorVariable

def dist(
    lam: TensorVariable,
    shift: TensorVariable,
    size: TensorVariable,
) -> TensorVariable:
    return pm.Exponential.dist(lam, size=size) + shift

iqs_mean, iqs_std = iqs["iq"].mean(), iqs["iq"].std()

formula = bmb.Formula("iq ~ 0 + group", "sigma ~ 0 + group")

priors = {
    "group": bmb.Prior("Normal", mu=iqs_mean, sigma=1000*iqs_std),
    "sigma": {"group": bmb.Prior("Uniform", lower=np.log(iqs_std/1000), upper=np.log(iqs_std*1000))},
    "nu": bmb.Prior(
    "ShiftedExponential",
    lam=1/29,
    shift=-1.0,
    dist=lambda name, lam, shift: pm.CustomDist(
        name,
        lam,
        shift,
        dist=dist,
        signature="(),()->()"
        )
    )
}

@ivanistheone (Author)

I tried your code and it produces a prior shifted by 1 unit in the wrong direction:

[Screenshot: plot of the sampled nu prior, shifted by 1 unit in the wrong direction]

BTW, using the un-shifted prior bmb.Prior("Exponential", lam=1/29) for nu also works fine (no divergences).

I've decided to use the un-shifted prior since it seems to produce exactly the same results as in the shifted case (cf. pm.Truncated workaround in the folded example).

Feel free to close this issue, since I think we're talking about an edge case here that might not be super important... except perhaps if we can figure out a nice example of a pm.CustomDist prior to add to the docs.

@GStechschulte (Collaborator)

> I tried your code and it produces a prior shifted by 1 unit in the wrong direction

That makes sense, since I passed -1. 😆
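For the record, the same snippet with the sign flipped should give the intended prior with support on [1, inf):

    "nu": bmb.Prior(
        "ShiftedExponential",
        lam=1/29,
        shift=1.0,  # shift in the positive direction this time
        dist=lambda name, lam, shift: pm.CustomDist(
            name,
            lam,
            shift,
            dist=dist,
            signature="(),()->()",
        ),
    ),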

I will keep the issue open and add a docs tag to remember to add an example on using pm.CustomDist for custom priors. Thanks!
