Walmart is an American multinational retail corporation that operates a chain of supercenters, discount department stores, and grocery stores in the United States. Walmart has more than 100 million customers worldwide.
As a data analyst at Walmart, my primary objective is to analyze customer purchase behavior during Black Friday sales to drive strategic decision-making and improve business performance.
I am tasked with investigating whether there are clear differences in spending habits between male and female customers.
My goal is to find out if women spend more than men during Black Friday at Walmart.
The dataset has the following features:
1. User_ID: User ID
2. Product_ID: Product ID
3. Gender: Sex of the user
4. Age: Age in bins
5. Occupation: Occupation (masked)
6. City_Category: Category of the city (A, B, C)
7. StayInCurrentCityYears: Number of years the user has stayed in the current city
8. Marital_Status: Marital status
9. ProductCategory: Product category (masked)
10. Purchase: Purchase amount
Track the amount spent per transaction by all 50 million female customers and all 50 million male customers, calculate the average for each group, and state your conclusions.
State your inference after computing the average female and male expenses.
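The group averages can be sketched as follows. This is a minimal illustration using only the Python standard library; the four rows are made-up toy values, not records from the dataset, and the helper name average_purchase_by is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows for illustration; not taken from the real dataset.
rows = [
    {"Gender": "F", "Purchase": 8000},
    {"Gender": "M", "Purchase": 9500},
    {"Gender": "F", "Purchase": 9000},
    {"Gender": "M", "Purchase": 10500},
]

def average_purchase_by(rows, column):
    """Return {group value: mean Purchase} for the given column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["Purchase"])
    return {key: mean(values) for key, values in groups.items()}

averages = average_purchase_by(rows, "Gender")
print(averages)
```

Because the grouping column is a parameter, the same helper could be reused for Marital_Status or Age.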
Use the sample average to find an interval within which the population average is likely to lie. Using samples of female and male customers, calculate the interval within which the average spending of the 50 million female customers and of the 50 million male customers may lie.
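One common way to obtain such an interval is the normal approximation justified by the Central Limit Theorem: sample mean plus or minus a critical value times the standard error. The sketch below assumes that approach; the function name mean_confidence_interval and the synthetic sample (arbitrary mean and spread) are illustrative assumptions, not part of the dataset.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation (CLT) confidence interval for the population mean."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

# Synthetic purchase amounts with arbitrary parameters; illustration only.
rng = random.Random(42)
female_sample = [rng.gauss(9000, 5000) for _ in range(1000)]

low, high = mean_confidence_interval(female_sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

The approximation is reasonable for large samples; for small samples a t-based interval would be the more cautious choice.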
Vary the sample size and observe how the distribution of the mean expense of female and male customers changes.
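The effect of sample size can be explored by drawing many samples of each size and looking at how much the sample means vary. The sketch below assumes a synthetic, right-skewed population as a stand-in for purchase amounts; the sizes, the number of repetitions, and the helper name sample_means are arbitrary choices for illustration.

```python
import random
from statistics import fmean, stdev

def sample_means(population, sample_size, repetitions, rng):
    """Draw repeated samples (with replacement) and return the mean of each."""
    return [
        fmean(rng.choices(population, k=sample_size))
        for _ in range(repetitions)
    ]

rng = random.Random(0)
# Synthetic, right-skewed stand-in for purchase amounts; illustration only.
population = [rng.expovariate(1 / 9000) for _ in range(20_000)]

spread_by_size = {}
for size in (30, 300, 3000):
    means = sample_means(population, size, repetitions=200, rng=rng)
    spread_by_size[size] = stdev(means)
    print(f"n={size:>5}: spread of sample means = {spread_by_size[size]:.1f}")
```

The spread of the sample means should shrink roughly in proportion to one over the square root of the sample size, which is why larger samples give narrower intervals.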
The interval that you calculated is called a confidence interval. The confidence level, which determines the width of the interval, is mostly decided by the business: typically 90%, 95%, or 99%. Vary the confidence level and report your observations.
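How the confidence level drives the width can be seen from the critical values alone. The sketch below assumes a normal-approximation interval and a hypothetical standard error of 150; only the critical values (about 1.645, 1.960, and 2.576) are standard facts.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

# Hypothetical standard error of the sample mean, chosen for illustration.
standard_error = 150.0

widths = {}
for confidence in (0.90, 0.95, 0.99):
    z = critical_value(confidence)
    # Full width of the interval: mean - z*SE up to mean + z*SE.
    widths[confidence] = 2 * z * standard_error
    print(f"{confidence:.0%}: z = {z:.3f}, width = {widths[confidence]:.1f}")
```

A higher confidence level gives a wider interval for the same data, so the business trades precision for certainty.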
Check whether the confidence intervals of the average male and female spends overlap. How can Walmart leverage this conclusion to make changes or improvements?
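The overlap check itself is a simple comparison of interval endpoints. The sketch below uses hypothetical interval values purely for illustration; the function name intervals_overlap is likewise an assumption.

```python
def intervals_overlap(first, second):
    """Return True if two closed intervals (low, high) share any point."""
    return first[0] <= second[1] and second[0] <= first[1]

# Hypothetical intervals, for illustration only.
female_ci = (8600.0, 8900.0)
male_ci = (9300.0, 9600.0)

print(intervals_overlap(female_ci, male_ci))
```

Non-overlapping intervals suggest that the two group means differ at the chosen confidence level. Overlapping intervals do not by themselves prove that the means are equal, so that case calls for a more cautious conclusion.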
Perform the same analysis for married vs. unmarried customers and for age groups.
For Age, you can try bins based on life stages: 0-17, 18-25, 26-35, 36-50, 51+ years.
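Since Age is already stored in bins, regrouping into life stages amounts to a label mapping. The sketch below assumes the raw bin labels shown in the dictionary keys; these labels are an assumption and should be adjusted to whatever values actually appear in the Age column.

```python
# Assumed raw labels of the binned Age column (hypothetical; verify against
# the data) mapped to the life-stage groups suggested above.
LIFE_STAGE_BY_AGE_BIN = {
    "0-17": "0-17",
    "18-25": "18-25",
    "26-35": "26-35",
    "36-45": "36-50",
    "46-50": "36-50",
    "51-55": "51+",
    "55+": "51+",
}

def life_stage(age_bin):
    """Map a raw Age bin label to its life-stage group."""
    return LIFE_STAGE_BY_AGE_BIN[age_bin.strip()]

print(life_stage("46-50"))
```

An unknown label raises a KeyError, which makes unexpected values in the column visible instead of silently misclassifying them.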
• Perform EDA on the given dataset and find insights.
• Provide useful insights and business recommendations that can help the business grow.