
Remove unnecessary calls to PyTensor eval() from user-facing methods #386

Closed
wants to merge 4 commits

Conversation

cluhmann
Contributor

@cluhmann cluhmann commented Oct 1, 2023

Fixes #383. The calls to eval() have been moved out of the BaseDelayedSaturatedMMM.channel_contributions_forward_pass() method and into user-facing methods in DelayedSaturatedMMM. It is not entirely clear to me whether get_channel_contributions_forward_pass_grid() is supposed to be user-facing, but I assumed it is.

What do we think about the possibility of more clearly distinguishing between the user-facing methods that return numpy/xarray objects and the internal methods that are working with tensors? The user-facing methods could then just be light wrappers converting/eval()ing the tensors to numpy/xarray arrays.
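To make the proposal concrete, here is a rough sketch of that split (the class and method names are hypothetical, purely illustrative, not the actual pymc-marketing API):

import numpy as np
import pytensor.tensor as pt

class BaseMMM:
    def _channel_contributions(self, channel_data):
        # Internal: return a symbolic PyTensor expression so other graph code
        # (e.g. an optimizer) can compose with it. Placeholder transform only.
        x = pt.as_tensor_variable(channel_data)
        return x / x.max()

    def get_channel_contributions(self, channel_data):
        # User-facing: a light wrapper that materializes the graph as numpy.
        return self._channel_contributions(channel_data).eval()

mmm = BaseMMM()
print(mmm.get_channel_contributions(np.array([1.0, 2.0, 4.0])))  # [0.25 0.5  1.  ]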


📚 Documentation preview 📚: https://pymc-marketing--386.org.readthedocs.build/en/386/

@cluhmann cluhmann changed the title Remove unnecessarily calls to PyTensor eval() from user-facing methods Remove unnecessary calls to PyTensor eval() from user-facing methods Oct 1, 2023
@@ -564,7 +564,7 @@ def channel_contributions_forward_pass(
    excluded=[1, 2],
    signature="(m, n) -> (m, n)",
)
- return target_transformed_vectorized(channel_contribution_forward_pass)
+ return target_transformed_vectorized(channel_contribution_forward_pass).eval()
Contributor

I think you need to put the eval inside, since target_transformed_vectorized is pure numpy code

Suggested change
- return target_transformed_vectorized(channel_contribution_forward_pass).eval()
+ return target_transformed_vectorized(channel_contribution_forward_pass.eval())

Contributor

@juanitorduz can we use PyTensor functions for the inverse_transforms? That would simplify this PR quite a lot. We could use pytensor.graph.vectorize instead of numpy.vectorize

@@ -383,7 +383,7 @@ def channel_contributions_forward_pass(
    channel_contribution_forward_pass = (
        beta_channel_posterior_expanded * logistic_saturation_posterior
    )
- return channel_contribution_forward_pass.eval()
+ return channel_contribution_forward_pass
Contributor

The type signature has to be updated. However, the subclass shouldn't change the meaning of the method (by returning a numpy value).

Contributor
@ricardoV94 ricardoV94 Oct 2, 2023

In the meantime, should we call the base-class method "channel_contribution_forward_pass_untransformed", so that the sub-classes can call it without being ambiguous about the meaning?

This is only needed for as long as we can't return the "transformed" versions as PyTensor graphs.

@cluhmann
Contributor Author

Note to self: we decided that the only difference (for now) between the parent class version and child class version of channel_contributions_forward_pass is the transformation. So naming the parent class method something like channel_contribution_forward_pass_untransformed() should be ok.

@juanitorduz
Collaborator

Note to self: we decided that the only difference (for now) between the parent class version and child class version of channel_contributions_forward_pass is the transformation. So naming the parent class method something like channel_contribution_forward_pass_untransformed() should be ok.

Good point! You are right! 🙈 Naming things is hard 😅

@cluhmann
Contributor Author

Ok. So this is at the point where I can either modify the tests to reflect the new return value/method name or try to convert the target transformation method to be a native pytensor function. The latter can obviously be a separate PR, but it seems like the tests would have to be rewritten again after converting the method to a pytensor function, so I figured it would make sense to ask for feedback.

We had a bit of discussion about the adoption of sklearn in #154, but there was no definitive resolution.

@ricardoV94
Contributor

@cluhmann completely up to you, whether you want to do the extra work or leave it for later

@wd60622
Contributor

wd60622 commented Oct 23, 2023

Personally, I am for sklearn, especially for the transformation of the dataframe X in order to support new data. The transformations that are currently used carry state (max value, or mean and std of the data) which we would have to mimic in some form even if we migrate away.
The fit, fit_transform, transform API has become commonplace, and sklearn provides functionality for transforming specific dataframe columns, preserving the index, and other pandas support, the absence of which has caused some of the recent bugs (reset_index etc.).

I am working on a PR which shows how much sklearn can help reduce the transformation code we have to maintain while supporting the new data use-case.

Yes, for just y it seems a bit overkill, but in other locations I think it is beyond helpful.

Are there concerns of it as a dependency?

@ricardoV94
Contributor

ricardoV94 commented Oct 23, 2023

Are there concerns of it as a dependency?

The issue is that we can't integrate sklearn stuff with other PyMC/PyTensor methods, e.g., the optimizer stuff that @cluhmann has been working on. It's a silly limitation since all we are using them for is to do some scale/max transforms which we can do just fine in PyTensor.

@wd60622
Contributor

wd60622 commented Oct 23, 2023

Are there concerns of it as a dependency?

The issue is that we can't integrate sklearn stuff with other PyMC/PyTensor methods, e.g., the optimizer stuff that @cluhmann has been working on. It's a silly limitation since all we are using them for is to do some scale/max transforms which we can do just fine in PyTensor.

My concern is for new data. The machine learning world keeps track of the mean and std of the training / fit data and uses those for transformations of the held-out set.

For instance, I would want to avoid cases like this:

import pymc as pm

with pm.Model() as model:
    y = pm.MutableData("y", [1, 2, 3])
    y_normalized = y / y.max()

y_normalized.eval() # array([0.33333333, 0.66666667, 1.        ])

with model: 
    pm.set_data({"y": [2, 3, 4]})
    
y_normalized.eval() # array([0.5 , 0.75, 1.  ])

I view this as data leakage.

@ricardoV94
Contributor

I view this as data leakage.

That's not leakage, it's how the model was written. You use constants if you don't want them to change
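For example (a minimal sketch, not code from this PR), computing the scaling factor from the training data once and baking it in as a constant keeps it fixed when the data container is updated:

import pymc as pm

y_train = [1, 2, 3]
y_train_max = max(y_train)  # computed once from the training data, kept as a constant

with pm.Model() as model:
    y = pm.MutableData("y", y_train)
    y_normalized = y / y_train_max  # no longer depends on whatever data is currently set

y_normalized.eval()  # array([0.33333333, 0.66666667, 1.        ])

with model:
    pm.set_data({"y": [2, 3, 4]})

y_normalized.eval()  # array([0.66666667, 1.        , 1.33333333])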

@wd60622
Contributor

wd60622 commented Oct 23, 2023

Obviously that is the intended behavior of pytensor. However, I am talking within the context of an MMM. If y were cost, then the max cost used to determine the model parameters should be based on the fitted data, not the data in a holdout set.

Making the transformations on constants, like you said. I just claim that using sklearn will be the most manageable way of doing that before putting the transformed data into MutableData containers.

@ricardoV94
Contributor

ricardoV94 commented Oct 24, 2023

Making the transformations on constants, like you said. I just claim that using sklearn will be the most manageable way of doing that before putting the transformed data into MutableData containers.

The issue is when doing optimization we want to have PyTensor expressions that depend on those transformations (so that we can auto-diff through them). At that point sklearn becomes an issue more than it helps, because it is not composable with PyTensor expressions. Basically we want to write those sklearn transforms in PyTensor, so they cache the max/mean/std (whatever) of the original data as a constant and provide a back/forward method that can be called with new PyTensor expressions.
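A rough sketch of what such a PyTensor-native transformer might look like (the class name and the fit/forward/backward API are made up for illustration; this is not an existing pymc-marketing or PyTensor interface):

import numpy as np
import pytensor.tensor as pt

class MaxAbsScalerPT:
    """A max-abs scaler that caches its statistic as a constant and stays symbolic."""

    def fit(self, data):
        # Cache the statistic of the original data as a PyTensor constant.
        self.max_abs_ = pt.constant(np.abs(np.asarray(data)).max(axis=0))
        return self

    def forward(self, x):
        # x can be a numpy array or any PyTensor expression; the result is a
        # PyTensor graph, so it composes with other expressions and is auto-diffable.
        return pt.as_tensor_variable(x) / self.max_abs_

    def backward(self, x_scaled):
        return pt.as_tensor_variable(x_scaled) * self.max_abs_

scaler = MaxAbsScalerPT().fit(np.array([[1.0, 10.0], [2.0, 20.0]]))
new_x = pt.matrix("new_x")
scaled = scaler.forward(new_x)  # symbolic, differentiable with respect to new_x
print(scaled.eval({new_x: np.array([[4.0, 40.0]])}))  # [[2. 2.]]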

@cluhmann
Contributor Author

To me the question is how pervasive we expect the sklearn components to be through the pymc-marketing codebase. Right now, it would be easy enough (famous last words) to replace the sklearn transformation (with all the robustness that @wd60622 notes) because the sklearn ingredients are not used all that often. But if we foresee lots of sklearn functionality being adopted within the project, then rewriting everything in pytensor becomes less appealing. But I don't have a good sense of what the future looks like in regard to sklearn usage.

@ricardoV94
Contributor

@cluhmann I agree, and my hunch is that eventually someone will want these operations to be part of the model itself and not just data pre-processing (perhaps downstream of MutableData). As soon as that happens you can no longer use sklearn or numpy routines.

@wd60622
Contributor

wd60622 commented Oct 26, 2023

I am not attached to sklearn. However, I am for exchangeability of transformation methods. Having them all as mixins makes it very hard to change anything without rewriting much of the code, leaving room for errors, etc.

I raised #407 to get away from the mixins currently used for transformation. My draft PR has an x_transformer which mirrors the target_transformer attribute. Doing so provides the forward pass on new data X via a single transform call, which was previously implemented incorrectly (#405), while remaining open (with some restrictions) to changing the transformation that is applied.

I just praise sklearn for the fit / fit_transform / transform separation, which is important especially when parameters are estimated from artifacts of the "training" data (mean, std, max, etc.). And since different transformations are applied to different subsets of a DataFrame, the ColumnTransformer is made just for that.
I am all for having the transformations on the data in pytensor if these details aren't missed.

To illustrate the difficulty of the current code, I tried to use a MaxAbsScaler for the control columns as well. But doing so requires overriding mixins or completely redefining the class and copying much of the code.
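For reference, a minimal sketch of the kind of sklearn setup being described (the column names and the StandardScaler choice are made up for illustration; they are not the actual pymc-marketing configuration):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MaxAbsScaler, StandardScaler

X_train = pd.DataFrame(
    {
        "channel_1": [100.0, 200.0, 400.0],
        "channel_2": [10.0, 20.0, 40.0],
        "control_1": [1.0, 2.0, 3.0],
    }
)

# Different transformations for different column subsets, fit once on the training data.
x_transformer = ColumnTransformer(
    [
        ("channels", MaxAbsScaler(), ["channel_1", "channel_2"]),
        ("controls", StandardScaler(), ["control_1"]),
    ]
)
x_transformer.fit(X_train)

# New data is transformed with the statistics learned from the training data.
X_new = pd.DataFrame({"channel_1": [800.0], "channel_2": [80.0], "control_1": [4.0]})
print(x_transformer.transform(X_new))  # approx [[2.  2.  2.449]]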

@ricardoV94
Contributor

ricardoV94 commented Oct 26, 2023

I agree that the current models are very hard to customize, but I don't see how sklearn plays any role in that.

@juanitorduz
Collaborator

I do not have a strong opinion. Any volunteer(s) to draft a PR removing these transformers? We can all provide feedback 🙌

@wd60622
Contributor

wd60622 commented Oct 26, 2023

I have this PR up which removes the mixins for preprocessing and uses a single transformer (ColumnTransformer) to handle the transformations of new data.

It addresses a lot, but the original purpose was to support new data, which needs to be transformed only in the _data_setter.

@juanitorduz
Collaborator

🙌 I'll take a look at it in detail once I'm back (🏖️🌴)

@wd60622
Contributor

wd60622 commented Jul 25, 2024

Also a stale one. Can we close and reframe as some issues?

@cluhmann
Contributor Author

What do you mean? @juanitorduz said he would take a look (in 2023 😂 ). Yeah, I think we can close this. If it's still an issue, it will crop up (again) as we work on #358

@juanitorduz
Collaborator

juanitorduz commented Jul 25, 2024

OMG! 2023 was a tough year 😄🙈! Sorry 🥲

@wd60622 wd60622 closed this Jul 30, 2024
@twiecki twiecki deleted the eval_removal branch September 11, 2024 07:13
Development

Successfully merging this pull request may close these issues.

Avoid PyTensor evals whenever possible