The classic trade-off of an imbalanced dataset, and why the model with the highest accuracy isn't always the best one.
Our bank has been facing declining revenue lately. To combat this, it has launched marketing campaigns to encourage more term deposits.
The bank opted to develop a deep-learning model to predict the campaign's outcome for each customer. This enables the marketing team to pinpoint customer segments with high potential and target them directly, while minimizing ad spend on customers who are less likely to subscribe.
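The post doesn't show the model itself, but as a rough illustration, a binary "will they subscribe?" classifier is often a small feed-forward network. Everything below (the feature count, layer sizes, and Keras as the framework) is an assumption, not the actual model:

```python
import tensorflow as tf

# Assumed setup: each customer is described by 20 numeric features after preprocessing,
# and the target is 1 (subscribed) / 0 (didn't). These numbers are illustrative only.
NUM_FEATURES = 20

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of subscribing
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    # Tracking recall alongside accuracy matters later in this story.
    metrics=["accuracy", tf.keras.metrics.Recall(name="recall")],
)
# model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2)
```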
The trained model comes back with an accuracy of 0.91. Sounds like a fantastic model, right?
Well, not so fast! The marketing manager has some concerns.
When customers see our campaign, they either subscribe to a term deposit or they don't. This gives us four possible outcomes, each carrying a different weight for the marketing team (the sketch after the table shows how to pull these counts from a model's predictions):
Outcome | Description |
---|---|
True Positive (TP) | Customers subscribed, just as we predicted! 😍 |
True Negative (TN) | Customers didn't subscribe, and we accurately anticipated it. This foresight optimizes our marketing budget 😎 |
False Positive (FP) | Oops! We expected these customers to subscribe, but they didn't. This misjudgment costs the bank 😬 |
False Negative (FN) | Oh no! These customers subscribed, but we missed them. Bad news: the bank loses revenue, and we could be losing our jobs here 😱 |
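Here's a minimal sketch of how these four counts (and the per-class recall discussed next) can be read off with scikit-learn; the toy arrays below stand in for the real held-out labels and the model's predictions:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Toy stand-ins for the real data: 1 = subscribed, 0 = didn't subscribe.
y_true = np.array([1, 0, 0, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0, 0, 1, 0, 0, 1, 0])

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")

# Per-class precision and recall; the recall for the "yes" class is the
# number the marketing team actually cares about.
print(classification_report(y_true, y_pred, target_names=["no", "yes"]))
```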
Looking at the recall for the positive class (the subscribers), it's only 0.37. This means that out of 1000 customers who would sign up, our model mistakenly labels 630 of them as not interested! 😱
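Spelling out that arithmetic, with hypothetical round numbers chosen to match a 0.37 recall:

```python
# Hypothetical counts for the positive (subscriber) class.
tp = 370   # subscribers we correctly flagged
fn = 630   # subscribers we wrongly labelled as "not interested"

recall = tp / (tp + fn)   # recall = TP / (TP + FN)
print(recall)             # 0.37
```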
Suddenly, this model doesn't seem so great for our campaign.
After using SMOTE to oversample the minority class in our training data, our accuracy took a hit, dropping to 0.84. But here's the silver lining: recall for the positive class shot up to 0.91! 🥳
This means that out of 1000 customers who would subscribe, our model now catches 910 of them. The marketing team is ecstatic! They can now confidently use our model to segment customers and supercharge their sales funnel.
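For reference, here's a minimal sketch of the resampling step with imbalanced-learn's SMOTE. The synthetic dataset, split, and variable names are placeholders, not the actual pipeline:

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bank marketing data: roughly 10% positive class.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)

# Split first, then resample ONLY the training set. The test set stays untouched
# so the evaluation still reflects the real-world class balance.
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

smote = SMOTE(random_state=42)
X_res, y_res = smote.fit_resample(X_train, y_train)

print("before:", Counter(y_train))   # heavily imbalanced
print("after: ", Counter(y_res))     # classes balanced with synthetic minority samples
```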