Scale spend data to improve budget allocation efficacy #945

MobiusLooper · 2024-08-19T15:33:37Z

Description

This change adds scaling to the budget, bounds, and initial guess inside the BudgetOptimizer before handing the budget to the minimize function. It then unscales the data inside the objective function, and again unscales them when passing them back to the MMM at the end of the allocate_budget method. This is because the new changes from PR933 lead to potentially large values being handed to the SLSQP minimizer, and from what I understand, this optimisation is best suited to well scaled data. On my MMM, the larger scale of input (spend in the $100,000 region) meant the optimizer got stuck in a very flat part of the search space, and effectively regurgitated the equally weighted initial guess as optimal, which it was far from being.

I didn't add any tests because it would require instantiating a fitted MMM with actual parameters, and demonstrating that the output deviating from the initial guess, but given the presumably stochastic nature of the optimisation, that felt like not a very good test anyway. But, I could think of something if necessary.

Related Issue

Closes #
Related to Budget allocation not sum up to 1 #843

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Modules affected

MMM
CLV

Type of change

📚 Documentation preview 📚: https://pymc-marketing--945.org.readthedocs.build/en/945/

* add scaling into lift test method * test scaling methods * test change of MMM likelihood * link up beta_channel parameters * change name to sigma * reorganize and edit * closes pymc-labs#406 * address the feedback in docstrings * add more to docstring * format the docstring * be verbose for future devs * be explicit for the column max values * incorporate the feedback * hide cells and add to intro * add conclusion * switch to header 2 * run notebook through * move the types together * separate model estimated from empirical * fix typos

* drop python 3.9 * try python 3.12 * undo try python 3.12

* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually

* add more info to the notebook * hide plots code * fix plot y labels * fix plot outputs and remove model build * improve final note probability plots * address comments * use quickstart dataset * feedback part 3 * remowe warnings manually * feedback part 4

* improve mmm docs init * add more code examples to docstrings * minor improvemeents * typo * better phrasing * add thomas suggestion

* move fixtures to conftest * docstrings and moved set_model_fit to conftest * fixed pandas quickstart warnings * revert to MockModel and add ParetoNBD support * quickstart edit for issue 609 * notebook edit

remove ruff E501 ignore

updates: - [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.3.7](astral-sh/ruff-pre-commit@v0.3.5...v0.3.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Juan Orduz <[email protected]>

* fix potential bugs * minor improvements * remove rule for nb * fix test * improve tests syntax * use stacklevel 2 for warnings * use stacklevel 1 for warnings as they are used in public methods * ignore B904 * undo change * ricardos feedback * use fit_posterior * Update pymc_marketing/clv/models/gamma_gamma.py Co-authored-by: Ricardo Vieira <[email protected]> * last fix XD --------- Co-authored-by: Ricardo Vieira <[email protected]>

* notebook opening and imports * model definition markdown * Data Load Notebook section * WIP model fitting section * added notebook to docs directory * notebook edits and graph code * ppc section and nb cleanup * demz sampling and WIP plotting * WIP predictive plots * WIP heatmap plots * predictive plots * WIP covariates and nbqa-ruff edits * covariate section * plot additions * fig sizes * remove model file

…book (pymc-labs#651) * add spaces, increase indentation, and fix number order * explicit with 6

* Creating plot waterfall Co-Authored-By: Carlos Trujillo <[email protected]> * requested changes * pre-commit --------- Co-authored-by: Carlos Trujillo <[email protected]>

Databricks should have a lower-case b.

updates: - [github.com/astral-sh/ruff-pre-commit: v0.4.1 → v0.4.2](astral-sh/ruff-pre-commit@v0.4.1...v0.4.2) - [github.com/pre-commit/mirrors-mypy: v1.9.0 → v1.10.0](pre-commit/mirrors-mypy@v1.9.0...v1.10.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* support for negative values and dates (if used) * fix terrible spelling * test dates in coords * cover numpy objects * consolidate the tests

* Correcting typo num_days by horizon * Correcting typo num_days by horizon and scaler * Running notebooks * Update UML Diagrams * Rename horizon by periods * Adding test requested to check budget outputs * Running notebooks * Update UML Diagrams * Small notebook missing change. * Correction in tests * Change on name * running notebook modifying function * Update UML Diagrams

* helper command to view the artifacts from test * pass tune from kwargs * test for support of all samplers * add mlflow as a mock import * actual import as autolog is missing from docs

Co-authored-by: Juan Orduz <[email protected]>

updates: - [github.com/astral-sh/ruff-pre-commit: v0.5.7 → v0.6.1](astral-sh/ruff-pre-commit@v0.5.7...v0.6.1) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

juanitorduz · 2024-08-19T19:18:43Z

Thank you for your contribution and context @MobiusLooper !

At the moment, we have just one test failing:

FAILED tests/mmm/test_delayed_saturated_mmm.py::TestDelayedSaturatedMMM::test_allocate_budget_to_maximize_response

From your remark

I didn't add any tests because it would require instantiating a fitted MMM with actual parameters, and demonstrating that the output deviating from the initial guess, but given the presumably stochastic nature of the optimisation, that felt like not a very good test anyway. But, I could think of something if necessary.

Feel free to modify (or even delete) old tests you think are not good enough. If you delete one test, please add one (or more)
to make sure the code is doing what you expect.

Feel free to ask for support in the tests :)

MobiusLooper · 2024-08-21T11:35:54Z

I realised I missed handling the (common) case of bounds being provided to the allocate_budget method, hence the update on scaling bounds

cetagostini · 2024-08-21T11:55:38Z

Hey @MobiusLooper, thanks for raise a PR. I'm not quite sure about the solution, some things worry me but could we think about them together.

a) You are scaling budgets based on the self.scales.max(), meaning, the budget get scale by the max across channels, but the channels scales are based within channels.

I think this is similar to the initial implementation and my personal opinions is it can bring some errors, also as complications in interpretation. The resulting number is not scaled back with the same value used during training, therefore it cannot be understood in the same way.

I left a small exercise where given a random allocation and specific scales per channel, using an allocation across channels can lead to an underestimation/overestimation (depending on the case) of the global budget and its response. Although directionally the response is the same, the amount of budget allocated to each one varies, and as the numbers are larger the variation is greater because the small decimals (on scale) represent larger units.

b) Even if there is a workaround for this, and we could decide that some scaling is not representative or that it might even be correct, is this really the solution? If the problem is that the optimizer is not exploring the 2D space optimally, we can expose the initial search values, the tolerance, increase the number of iterations, or adjust the bounds to be more informative than zero to inf, in order to guide it.

PS: We could even expose the method and you could switch to other more suitable if needed.

I could not find any reference in the official documentation from scipy about the optimizer working better inside a scaled space than outside a scaled space. If you have official information, it would be good to consider it.

I'm short on time this week, but it certainly this calls for discussion. Ideally, we'll validate this with examples. If we can build a couple of cases where we can see that this behavior is repetitive and not solvable by manipulating the internal parameters of the minimize, then we can think about it. I want to verify that this is not an isolated case, because I have been using it without seeing this behavior.

Example use case

### Using Global Scaler even when channels have an individual scale.

import numpy as np
import matplotlib.pyplot as plt

# Given parameters
lam_x1 = 0.2
beta_x1 = 1.2
lam_x2 = 0.7
beta_x2 = 2.1

scaler_x1 = 175 # e.g: original scale for x1 (MaxAbs for X1 based on model transformations)
scaler_x2 = 198 # e.g: original scale for x2 (MaxAbs for X2 based on model transformations)
scaler_target = 400 # e.g: Target scale (MaxAbs for target)

total_budget = 300 # Global budget to optimize
print("real total budget", total_budget)

split_x1 = 0.45  # 45% to x1 - e.g: Random example allocation decide by the scipy.minimize during the iteration process
split_x2 = 0.55  # 55% to x2 - e.g: Random example allocation decide by the scipy.minimize during the iteration process

# Scale the budget based on the max scaler between channels
max_scaler = max(scaler_x1, scaler_x2)

# Normalize the budget based on the global scale
normalized_budget_x1 = (total_budget * split_x1) / max_scaler
normalized_budget_x2 = (total_budget * split_x2) / max_scaler

# Total input budget to optimize normalize by global scaler.
print("normalized total budget", normalized_budget_x2+normalized_budget_x1)

# Total input budget to optimize in case to normalize each split by their respective existing scale (current implementation)
print("normalized budget on orignal scale", ((total_budget * split_x1) / scaler_x1) + ((total_budget * split_x2) / scaler_x2))

# Logistic saturation function
def logistic_saturation(x, lam):
    return (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))

# Calculate the response for each channel
response_x1 = logistic_saturation(normalized_budget_x1, lam_x1) * beta_x1
response_x2 = logistic_saturation(normalized_budget_x2, lam_x2) * beta_x2

# Combine the responses (assuming additive effects for simplicity)
total_response = response_x1 + response_x2

# Rescale the response to the original target scale
rescaled_response = total_response * scaler_target

# Rescale the total budget (x1+x2) based on the original scales (Even when the split for those was doing at MaxAbs across channels and not within)
rescaled_total_budget = (normalized_budget_x2*scaler_x2) + (normalized_budget_x1*scaler_x1) # Total budget don't match

# Rescale the total budget (x1+x2) based on the global scales
rescaled_total_budget_global_scale = (normalized_budget_x2+normalized_budget_x1) * max_scaler # Total budget match - Doesn't mean the implementation is accurate.

print("Total Budget Original Scale", rescaled_total_budget)
print("Total Budget Original Global Scale", rescaled_total_budget_global_scale)

# Print the results
print("Normalized Budget X1:", normalized_budget_x1)
print("Normalized Budget X2:", normalized_budget_x2)
print("Response X1:", response_x1)
print("Response X2:", response_x2)
print("Total Response:", total_response)
print("Rescaled Total Response:", rescaled_response)

# Plot the saturation curves and the points
x_values = np.linspace(0, 1, 100)
y_values_x1 = logistic_saturation(x_values, lam_x1) * beta_x1
y_values_x2 = logistic_saturation(x_values, lam_x2) * beta_x2

plt.figure(figsize=(12, 6))
plt.plot(x_values, y_values_x1, label='Channel X1 Response Curve')
plt.plot(x_values, y_values_x2, label='Channel X2 Response Curve')
plt.scatter([normalized_budget_x1], [response_x1], color='red', label='X1 Budget Allocation', zorder=5)
plt.scatter([normalized_budget_x2], [response_x2], color='blue', label='X2 Budget Allocation', zorder=5)
plt.xlabel('Normalized Budget')
plt.ylabel('Response')
plt.title('Logistic Saturation Response Curves')
plt.legend()
plt.grid(True)
plt.show()

cc: @juanitorduz @wd60622

MobiusLooper · 2024-08-21T16:17:01Z

Thanks for the comment and thoughts @cetagostini!

My understanding of how this is doesn't cause the same issue as the initial implementation is that that implementation was treating the scale of local channel spend, and total budget, differently. This shouldn't have that issue, as everything is being scaled by the same number. I'm only about 50% sure I understood your concern on a). I think the fact it's a shared scaling factor sidesteps it, but let me know if I might be mistaken.

On b), I agree, this doesn't seem like the ideal way to solve this problem. Ideally, as you suggest, the solve kwargs could be better used to get to a better optimisation, but I've not come up with anything from my (limited) research down this avenue. I also can't find any literature about scale in optimisation outside of a stack overflow post, so certainly nothing authoritative. So I'll have to rely on examples! I've created a simplified example below that might help demonstrate the problem. And I added at the end a demonstration of the invariance under scaling I was attempting to explain in the first paragraph.

import numpy as np
from scipy.optimize import minimize

def hill(x, gamma, alpha):
    return x ** alpha / (gamma ** alpha + x ** alpha)

def objective(spends, normalise=False):
    spends = spends / scalers
    if normalise:
        spends = spends * max(scalers)
    return -sum(hill(spends[i], gammas[i], alphas[i]) for i in range(3))

def constraint(spends, total_budget):
    return total_budget - sum(spends)

unscaled_budget = 150000
scalers = [70000, 60000, 90000]
alphas = [1.2, 0.5, 0.6]
gammas = [0.7, 0.3, 0.65]

unscaled_initial_spends = np.full(3, unscaled_budget / 3)
unscaled_bounds = [(0, unscaled_budget) for _ in range(3)]

scaled_budget = unscaled_budget / max(scalers)
scaled_initial_spends = np.full(3, scaled_budget / 3)
scaled_bounds = [(0, scaled_budget) for _ in range(3)]

unscaled_result = minimize(
    objective, 
    unscaled_initial_spends, 
    args=(False),
    bounds=unscaled_bounds,
    constraints={'type': 'eq', 'fun': constraint, 'args': (unscaled_budget,)},
    method='SLSQP',
    tol=1e-9
)

scaled_result = minimize(
    objective, 
    scaled_initial_spends, 
    args=(True),
    bounds=scaled_bounds,
    constraints={'type': 'eq', 'fun': constraint, 'args': (scaled_budget,)},
    method='SLSQP', 
    tol=1e-9
)

# demonstrating better optimisation results
print(f"Optimal spends without scaling: {unscaled_result.x}, Objective: {-unscaled_result.fun:.3f}")
print(f"Optimal spends with scaling: {scaled_result.x * max(scalers)}, Objective: {-scaled_result.fun:.3f}\n")

# demonstrating that the scaling doesn't change the calculation of the objective function
print(f"Objective for initial guess without scaling: {objective(unscaled_initial_spends):.6f}")
print(f"Objective for initial guess with scaling: {objective(scaled_initial_spends, True):.6f}")

* Default saturation_from_dict to default_priors * Default to AdstockTransformation.default priors in adstock_from_dict

AlfredoJF · 2024-08-22T05:13:43Z

Hey @MobiusLooper

Thanks for sharing this example. Here is the output:

Optimal spends without scaling: [50000. 50000. 50000.], Objective: 1.608
Optimal spends with scaling: [76298.5413179  33295.11262302 40406.34605908], Objective: 1.651

Objective for initial guess without scaling: -1.607527
Objective for initial guess with scaling: -1.607527

I see your point however, the implementation is not correct because we have individual scalers by channel.

I modified your objective function to test individual scalers in the minimize function and I got the same result without scaling.

def objective(spends, normalise=False, original_scale=False):
    spends = spends / scalers
    if normalise:
        spends = spends * max(scalers)
    if original_scale:
        spends = spends * np.array(scalers)
    return -sum(hill(spends[i], gammas[i], alphas[i]) for i in range(3))
    
 ind_scaled_budget = unscaled_budget / np.array(scalers)
ind_scaled_initial_spends = np.full(3, ind_scaled_budget / 3)
ind_scaled_bounds = [(0, ind_scaled_budget[i]) for i in range(3)]

ind_scaled_result = minimize(
    objective, 
    ind_scaled_initial_spends, 
    args=(False, True),
    bounds=ind_scaled_bounds,
    constraints={'type': 'eq', 'fun': constraint, 'args': (ind_scaled_budget,)},
    method='SLSQP', 
    tol=1e-9
)

# demonstrating better optimisation results
print(f"Optimal spends with individual scaling: {ind_scaled_result.x * np.array(scalers)}, Objective: {-ind_scaled_result.fun:.3f}\n")

# demonstrating that the scaling doesn't change the calculation of the objective function
print(f"Objective for initial guess with individual scaling: {objective(ind_scaled_initial_spends, False, True):.6f}")

Result:


Optimal spends with individual scaling: [50000. 50000. 50000.], Objective: 1.608

Objective for initial guess with individual scaling: -1.607527

MobiusLooper · 2024-08-22T10:12:40Z

Thanks for the comment @AlfredoJF

It's definitely interesting that this individual scaling also enables well scaled inputs for the optimisation, and yet also doesn't actually optimise the allocations away from the initial spends.

When you say

not correct

can I just ask you to clarify what you mean? I see that the individual scaling isn't currently used in the BudgetOptimizer from what I can see.

AlfredoJF · 2024-08-23T05:35:57Z

Hey @MobiusLooper, sure so the main issue is this line spends = spends / scalers in your objective function

If you set the value normalise=True your spends var will be completely different from the original spend and it's like comparing apples to oranges. We want apples to apples

MobiusLooper · 2024-08-27T10:58:23Z

Isn't my implementation the same as what's currently in main?

My implementation:

spends = spends / scalers

main branch:

for idx, (_channel, params) in enumerate(self.parameters.items()):
            budget = budgets[idx] / self.scales[idx]

camlaird · 2024-10-31T15:33:45Z

Hey @AlfredoJF, we were wondering if you managed to further have a look at this?

wd60622 and others added 30 commits April 2, 2024 15:07

Drop python 3.9 support (pymc-labs#615)

8590b45

* drop python 3.9 * try python 3.12 * undo try python 3.12

add lift tests check

27a55d5

Add more content to the Gamma-Gamma Notebook (pymc-labs#573)

f3ec2cf

* improve nb * rm warnings and add link to lifetimes quickstart * address comments * feedback part 3 * remove warnings manually

Improve MMM Docs (pymc-labs#612)

da9f025

* improve mmm docs init * add more code examples to docstrings * minor improvemeents * typo * better phrasing * add thomas suggestion

Fix clv plotting bugs and edits to Quickstart (pymc-labs#601)

38b55a2

* move fixtures to conftest * docstrings and moved set_model_fit to conftest * fixed pandas quickstart warnings * revert to MockModel and add ParetoNBD support * quickstart edit for issue 609 * notebook edit

[pre-commit.ci] pre-commit autoupdate (pymc-labs#616)

ffef5a6

improve coords matching (pymc-labs#623)

a151330

python 3.12 attempt (pymc-labs#618)

6f33cef

closes pymc-labs#520 (pymc-labs#621) dataset -> data

c40c8b8

mmm example notebook small fixes (pymc-labs#626)

ff937f4

Remove ruff E501 ignore (pymc-labs#619)

e2dba27

remove ruff E501 ignore

Update pypi.yml (dash change) (pymc-labs#633)

b777eb5

UP rule ruff (pymc-labs#635)

2be2f63

Add ruff RUF rules (pymc-labs#636)

582d2ed

Update version.txt (pymc-labs#640)

78964f0

[pre-commit.ci] pre-commit autoupdate (pymc-labs#642)

18d926c

Fix build badge (pymc-labs#645)

ce41f5f

Add downloads stats to README

818781a

add spaces, increase indentation, and fix number order to Pareto note…

7034e28

…book (pymc-labs#651) * add spaces, increase indentation, and fix number order * explicit with 6

Add link to new Pareto notebook (pymc-labs#649)

d531d0c

Plot Waterfall Components Decomposition (pymc-labs#631)

1c186f0

* Creating plot waterfall Co-Authored-By: Carlos Trujillo <[email protected]> * requested changes * pre-commit --------- Co-authored-by: Carlos Trujillo <[email protected]>

Update resources.md (pymc-labs#652)

b36b0ac

Databricks should have a lower-case b.

fix ylabel (pymc-labs#654)

ae6ae1d

Lift test fixes (#656)

a89bd93

* support for negative values and dates (if used) * fix terrible spelling * test dates in coords * cover numpy objects * consolidate the tests

cetagostini and others added 5 commits August 14, 2024 15:38

Adding BLAS to the env generation.

c1b5933

Merge branch 'main' of https://github.com/pymc-labs/pymc-marketing

35c8850

Log number of posterior & tuning samples (pymc-labs#943)

8d116b5

* helper command to view the artifacts from test * pass tune from kwargs * test for support of all samplers * add mlflow as a mock import * actual import as autolog is missing from docs

point to GH discussions (pymc-labs#944)

5bbce01

Co-authored-by: Juan Orduz <[email protected]>

wd60622 requested a review from cetagostini August 19, 2024 16:59

juanitorduz added the MMM label Aug 19, 2024

[pre-commit.ci] pre-commit autoupdate (pymc-labs#946)

3126ae0

updates: - [github.com/astral-sh/ruff-pre-commit: v0.5.7 → v0.6.1](astral-sh/ruff-pre-commit@v0.5.7...v0.6.1) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

cetagostini added the request discussion label Aug 19, 2024

MobiusLooper added 2 commits August 21, 2024 12:08

scale budget for minimize

a196a72

scale bounds

076dd2e

MobiusLooper force-pushed the budget-optimization-scaling branch from e724b93 to 076dd2e Compare August 21, 2024 11:09

cetagostini self-assigned this Aug 21, 2024

Fallback to defaults in adstock|saturation_from_dict (pymc-labs#955)

f6c6825

* Default saturation_from_dict to default_priors * Default to AdstockTransformation.default priors in adstock_from_dict

Add PyDocStyle Support (pymc-labs#951)

39d38b7

Merge branch 'main' into budget-optimization-scaling

3e7803d

ulfaslak force-pushed the main branch 4 times, most recently from f9a38fd to 818ba39 Compare September 8, 2024 12:38

twiecki force-pushed the main branch from 35dc7f1 to 9f9c67f Compare September 10, 2024 15:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scale spend data to improve budget allocation efficacy #945

Scale spend data to improve budget allocation efficacy #945

MobiusLooper commented Aug 19, 2024 •

edited

Loading

juanitorduz commented Aug 19, 2024

MobiusLooper commented Aug 21, 2024 •

edited

Loading

cetagostini commented Aug 21, 2024 •

edited

Loading

MobiusLooper commented Aug 21, 2024

AlfredoJF commented Aug 22, 2024

MobiusLooper commented Aug 22, 2024 •

edited

Loading

AlfredoJF commented Aug 23, 2024

MobiusLooper commented Aug 27, 2024

camlaird commented Oct 31, 2024

Scale spend data to improve budget allocation efficacy #945

Are you sure you want to change the base?

Scale spend data to improve budget allocation efficacy #945

Conversation

MobiusLooper commented Aug 19, 2024 • edited Loading

Description

Related Issue

Checklist

Modules affected

Type of change

juanitorduz commented Aug 19, 2024

MobiusLooper commented Aug 21, 2024 • edited Loading

cetagostini commented Aug 21, 2024 • edited Loading

Example use case

MobiusLooper commented Aug 21, 2024

AlfredoJF commented Aug 22, 2024

MobiusLooper commented Aug 22, 2024 • edited Loading

AlfredoJF commented Aug 23, 2024

MobiusLooper commented Aug 27, 2024

camlaird commented Oct 31, 2024

MobiusLooper commented Aug 19, 2024 •

edited

Loading

MobiusLooper commented Aug 21, 2024 •

edited

Loading

cetagostini commented Aug 21, 2024 •

edited

Loading

MobiusLooper commented Aug 22, 2024 •

edited

Loading