Remove unnecessary calls to PyTensor eval() from user-facing methods #386
Conversation
```diff
@@ -564,7 +564,7 @@ def channel_contributions_forward_pass(
             excluded=[1, 2],
             signature="(m, n) -> (m, n)",
         )
-        return target_transformed_vectorized(channel_contribution_forward_pass)
+        return target_transformed_vectorized(channel_contribution_forward_pass).eval()
```
I think you need to put the eval inside, since `target_transformed_vectorized` is pure numpy code:

```diff
-    return target_transformed_vectorized(channel_contribution_forward_pass).eval()
+    return target_transformed_vectorized(channel_contribution_forward_pass.eval())
```
@juanitorduz can we use PyTensor functions for the `inverse_transform`s? That would simplify this PR quite a lot. We could use `pytensor.graph.vectorize` instead of `numpy.vectorize`.
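As a rough illustration of that suggestion (the variable names here are made up, and in recent PyTensor versions the helper lives at `pytensor.graph.replace.vectorize_graph`), vectorizing the graph keeps everything symbolic instead of eval()-ing and wrapping a numpy function:

```python
import numpy as np
import pytensor.tensor as pt
from pytensor.graph.replace import vectorize_graph

# Core transform defined for a single vector (purely illustrative elementwise op).
x = pt.vector("x")
y = pt.exp(x) / (1 + pt.exp(x))

# Rewrite the same graph so it accepts a batched (matrix) input,
# instead of wrapping an eval()'d result with numpy.vectorize.
x_batch = pt.matrix("x_batch")
y_batch = vectorize_graph(y, replace={x: x_batch})

print(y_batch.eval({x_batch: np.zeros((2, 3))}))
```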
```diff
@@ -383,7 +383,7 @@ def channel_contributions_forward_pass(
         channel_contribution_forward_pass = (
             beta_channel_posterior_expanded * logistic_saturation_posterior
         )
-        return channel_contribution_forward_pass.eval()
+        return channel_contribution_forward_pass
```
The type signature has to be updated. However the subclass shouldn't change the meaning of the method (by returning a numpy value).
In the meantime, should we call the base-class method "channel_contribution_forward_pass_untransformed", so that the sub-classes can call it without being ambiguous about the meaning?
This is only needed for as long as we can't return the "transformed" versions as PyTensor graphs.
Note to self: we decided that the only difference (for now) between the parent class version and child class version of …
Good point! You are right! 🙈 Naming things is hard 😅
force-pushed from b5036de to a6e8251
Ok. So this is at the point where I can either modify the tests to reflect the new return value/method name, or try to convert the target transformation method to be a native pytensor function. The latter can obviously be a separate PR, but it seems like the tests would have to be rewritten again after converting the method to a pytensor function, so I figured it would make sense to ask for feedback. We had a bit of discussion about the adoption of sklearn in #154, but there was no definitive resolution.
@cluhmann completely up to you whether you want to do the extra work or leave it for later.
Personally, I am for sklearn, especially for the transformation of the dataframe `X` in order to support new data. The transformations that are currently used carry state (max value, or mean and std of the data) which we would have to mimic in some form even if we migrate away. I am working on a PR which shows how much sklearn can help reduce the transformation code we have to maintain while supporting the new-data use case.

Yes, for just … Are there concerns about it as a dependency?
The issue is that we can't integrate sklearn stuff with other PyMC/PyTensor methods, e.g., the optimizer stuff that @cluhmann has been working on. It's a silly limitation since all we are using them for is to do some scale/max transforms which we can do just fine in PyTensor.
My concern is for new data. The machine learning world keeps track of the mean and std of the training / fit data and uses those for transformations of the left-out set. For instance, I would want to avoid cases like this:

```python
with pm.Model() as model:
    y = pm.MutableData("y", [1, 2, 3])
    y_normalized = y / y.max()
    y_normalized.eval()  # array([0.33333333, 0.66666667, 1. ])

with model:
    pm.set_data({"y": [2, 3, 4]})
    y_normalized.eval()  # array([0.5 , 0.75, 1. ])
```

I view this as a data leakage.
That's not leakage, it's how the model was written. You use constants if you don't want them to change.
Obviously that is the intended behavior of pytensor. However, I am talking within the context of an MMM. Making the transformations on constants like you said, I just claim using sklearn will be the most manageable way of doing that before putting the transformed data into the model.
The issue is when doing optimization we want to have PyTensor expressions that depend on those transformations (so that we can auto-diff through them). At that point sklearn becomes an issue more than it helps, because it is not composable with PyTensor expressions. Basically we want to write those sklearn transforms in PyTensor, so they cache the max/mean/std (whatever) of the original data as a constant and provide a back/forward method that can be called with new PyTensor expressions.
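To make that concrete, here is a minimal sketch (not code from this PR; the class and method names are hypothetical) of a max-abs scaler whose statistics are cached as constants while the transforms stay composable with PyTensor expressions:

```python
import numpy as np
import pytensor.tensor as pt


class MaxAbsScalerPT:
    """Hypothetical sketch: cache the max of the training data as a constant
    and expose forward/inverse transforms that accept PyTensor expressions."""

    def fit(self, X: np.ndarray) -> "MaxAbsScalerPT":
        # Store the training-data statistic as a PyTensor constant.
        self.max_abs_ = pt.constant(np.max(np.abs(X), axis=0))
        return self

    def transform(self, X):
        # X can be a numpy array or any PyTensor expression (e.g. MutableData).
        return pt.as_tensor_variable(X) / self.max_abs_

    def inverse_transform(self, X_scaled):
        return pt.as_tensor_variable(X_scaled) * self.max_abs_


# The statistic comes from the original data only, so new data does not change
# the scaling, and the result stays a symbolic graph we can auto-diff through.
scaler = MaxAbsScalerPT().fit(np.array([1.0, 2.0, 3.0]))
print(scaler.transform(np.array([2.0, 3.0, 4.0])).eval())
```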
To me the question is how pervasive we expect the sklearn components to be through the pymc-marketing codebase. Right now, it would be easy enough (famous last words) to replace the sklearn transformations (with all the robustness that @wd60622 notes) because the sklearn ingredients are not used all that often. But if we foresee lots of sklearn functionality being adopted within the project, then rewriting everything in pytensor becomes less appealing. But I don't have a good sense of what the future looks like in regard to sklearn usage.
@cluhmann I agree, and my hunch is that eventually someone will want these operations to be part of the model itself and not just data pre-processing (perhaps downstream of MutableData). As soon as that happens you can no longer use sklearn or numpy routines.
I am not attached to sklearn. However, I am for the exchangeability of transformation methods. Having them all as mixins makes it very hard to change anything without having to rewrite much of the code, leaving room for errors, etc. I raised #407 to get away from the mixins currently used for transformation. My draft PR has …

I just praise sklearn for the fit / fit_transform and transform separation, which is important especially with parameters being estimated with artifacts of the "training" data (mean, std, max, etc). And since different transformations are happening to different subsets of DataFrame instances, the …

To illustrate the difficulty of the current code, I tried to use a …
I agree that the current models are very hard to customize, but I don't see how sklearn plays any role in that.
I do not have a strong opinion. Any volunteer(s) to draft a PR removing these transformers? We can all provide feedback 🙌
I have this PR up which removes the mixins for preprocessing and uses a single transformer (ColumnTransformer) to handle the transformations of new data. It addresses a lot, but the original purpose was to support new data which needs to be transformed only in the …
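For readers unfamiliar with `ColumnTransformer`, a minimal sketch of the pattern that comment describes (the column names here are made up; the actual PR decides which columns get which scaler):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical training data.
X_train = pd.DataFrame({"channel_1": [1.0, 2.0, 3.0], "channel_2": [10.0, 20.0, 30.0]})

preprocessor = ColumnTransformer(
    transformers=[("channels", MaxAbsScaler(), ["channel_1", "channel_2"])],
    remainder="passthrough",
)

# fit caches the per-column statistics; transform reuses them for new data.
X_train_scaled = preprocessor.fit_transform(X_train)
X_new_scaled = preprocessor.transform(
    pd.DataFrame({"channel_1": [4.0], "channel_2": [40.0]})
)
print(X_train_scaled)
print(X_new_scaled)
```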
🙌 I'll take a look at it in detail once I'm back (🏖️🌴)
Also a stale one. Can we close and reframe as some issues?
What do you mean? @juanitorduz said he would take a look (in 2023 😂). Yeah, I think we can close this. If it's still an issue, it will crop up (again) as we work on #358.
OMG! 2023 was a tough year 😄🙈! Sorry 🥲
Fixes #383. The calls to `eval()` have been moved out of the `BaseDelayedSaturatedMMM.channel_contributions_forward_pass()` method and into user-facing methods in `DelayedSaturatedMMM`. It is not entirely clear to me whether `get_channel_contributions_forward_pass_grid()` is supposed to be user-facing, but I assumed it is.

What do we think about the possibility of more clearly distinguishing between the user-facing methods that return numpy/xarray objects and the internal methods that work with tensors? The user-facing methods could then just be light wrappers converting/`eval()`-ing the tensors to numpy/xarray arrays.

📚 Documentation preview 📚: https://pymc-marketing--386.org.readthedocs.build/en/386/