Student Name: Tuo Chen
Student Email: [email protected]
Through this project, I gained valuable experience in performing exploratory data analysis (EDA) on a dataset that resembles real-world sales data. Specifically:
- Data Preprocessing: I strengthened my skills in cleaning and preparing data for analysis, such as handling date columns, deriving new features (e.g., profit margin), and managing missing values.
- Visualization Techniques: I learned to effectively use Plotly and Matplotlib to create interactive and static visualizations that uncover trends and patterns in the data.
- Insights from Data: The importance of understanding seasonal sales patterns, identifying underperforming products, and leveraging regional differences became clear.
- Streamlit for EDA: This project introduced me to Streamlit, which is excellent for creating interactive dashboards and making analysis accessible.
- Profitability Analysis: Calculating accurate profit margins and understanding the drivers of profitability for different products and regions was challenging.
- Handling Negative Growth: Interpreting negative sales growth trends in certain years and subcategories required careful contextual understanding.
- Streamlit Integration: Initially, integrating complex visualizations into the Streamlit app was time-consuming and required trial and error.
- While analyzing regional trends, the temporary sales surge in the South region (March 2011) seemed unusual, and I am unsure what external factors might have contributed.
- The high average annual growth rates (AAGR) for certain products, such as supplies, were hard to interpret without additional context.
- Time Series Analysis: Deeper analysis of trends over time to forecast future sales performance.
- Profitability Drivers: Understanding factors influencing profitability in more detail.
- Advanced Visualization: Creating more sophisticated dashboards with Streamlit and Plotly.
- Overall Sales Trends: Investigate how total sales evolved over time and identify significant fluctuations or anomalies that may require attention.
- Profitability Analysis: Dive deeper into which products are more profitable, the factors influencing profitability, and how the company's overall profitability changed during the analyzed period.
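The growth figures mentioned in the list above (negative sales growth in some years, high AAGR for some products) can be illustrated with a minimal sketch. The yearly totals are made-up numbers, not values from the project dataset, and the function names `yearly_growth_rates` and `average_annual_growth_rate` are hypothetical helpers introduced only for illustration.

```python
def yearly_growth_rates(totals):
    """Return year-over-year growth rates as fractions (0.10 means 10%).

    Assumes every previous-year total is non-zero.
    """
    return [(curr - prev) / prev for prev, curr in zip(totals, totals[1:])]


def average_annual_growth_rate(totals):
    """AAGR: the arithmetic mean of the year-over-year growth rates."""
    rates = yearly_growth_rates(totals)
    return sum(rates) / len(rates)


# Illustrative yearly sales totals (made up, not from the project data).
sales_by_year = [100.0, 150.0, 120.0, 180.0]

rates = yearly_growth_rates(sales_by_year)
aagr = average_annual_growth_rate(sales_by_year)

for year_index, rate in enumerate(rates, start=1):
    print(f"Year {year_index} -> {year_index + 1}: {rate:+.1%}")
print(f"AAGR: {aagr:.1%}")
```

Because AAGR is a plain average of yearly rates, a single large jump from a small base can pull it up even when another year shows negative growth. That may be one reason a small category shows a high AAGR, though the actual cause in the project data would need to be checked.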
--- Reflection Below This Line ---
Through this project, I gained valuable experience in data analysis using Python. I learned how to preprocess data efficiently by dropping unnecessary columns and deriving new features such as total discount and profit margin. By combining Pandas for data processing with Streamlit for interactive visualization, I was able to present meaningful insights about sales trends, customer behavior, and product performance. I also learned to use APIs, such as the Kaggle API, to retrieve datasets automatically. Overall, this project strengthened my skills in Python programming, data visualization, and analytical thinking, and gave me a better understanding of sales data.
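The preprocessing steps described above could look roughly like the following Pandas sketch. The column names (`Row ID`, `Order Date`, `Sales`, `Discount`, `Profit`), the sample rows, and the `preprocess` helper are assumptions made for illustration, not the project's actual schema or code. Treating total discount as `Sales * Discount` is also an assumption about how the discount column is defined.

```python
import pandas as pd

# Tiny made-up sample standing in for the real dataset (assumed columns).
raw = pd.DataFrame({
    "Row ID": [1, 2, 3],
    "Order Date": ["2011-03-01", "2011-03-15", "2012-01-10"],
    "Sales": [200.0, 50.0, 100.0],
    "Discount": [0.1, 0.0, 0.2],   # assumed to be a fraction of Sales
    "Profit": [40.0, -5.0, 25.0],
})


def preprocess(df):
    """Drop an unneeded column, parse dates, and derive two features."""
    out = df.drop(columns=["Row ID"])                     # returns a new frame
    out["Order Date"] = pd.to_datetime(out["Order Date"])
    out["Total Discount"] = out["Sales"] * out["Discount"]
    out["Profit Margin"] = out["Profit"] / out["Sales"]   # profit per unit of sales
    return out


clean = preprocess(raw)
print(clean)
```

A negative profit margin, as in the second sample row, marks an order sold at a loss, which is the kind of row worth inspecting when looking for underperforming products.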