This repository explores the streaming entertainment landscape through data analysis. Using Python's Beautiful Soup library, it scrapes data from JustWatch.com, a streaming service aggregator. The scraped data (IMDb ratings, runtimes, genres, release years, and streaming service details) is then analyzed and visualized with Pandas, Matplotlib, and Seaborn in a Jupyter Notebook (.ipynb). The analysis covers mean IMDb ratings, average runtimes for movies and shows, top genres, leading streaming services, and the highest-rated movies on Netflix. The goal is to help viewers and content providers better understand the streaming landscape and make informed decisions.
- Python
- Beautiful Soup
- Pandas
- Matplotlib
- Seaborn
- Jupyter Notebook
- `WebScraping.ipynb`: Jupyter Notebook containing the code for web scraping, data analysis, and visualization.
- `README.md`: This file, providing an overview of the project.
- Ensure Python is Installed: Make sure you have Python installed on your system.
- Install Required Libraries: Install the necessary libraries: `beautifulsoup4`, `requests`, `pandas`, `matplotlib`, and `seaborn`.
- Clone the Repository: Clone this repository to your local machine.
- Open Jupyter Notebook: Open the `Web Scraping.ipynb` notebook using Jupyter Notebook.
- Execute Code Cells: Follow the instructions within the notebook to execute the code cells and perform analysis.
- Explore Findings: Explore the findings and visualizations generated from the scraped data.
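
Before opening the notebook, it can help to confirm that the libraries from the install step are importable. The sketch below is one possible check using only the standard library; the helper name `find_missing` is made up for this example and is not part of the notebook.

```python
import importlib.util

# Import names of the required libraries. Note that the beautifulsoup4
# package is imported under the name "bs4".
REQUIRED_MODULES = ["bs4", "requests", "pandas", "matplotlib", "seaborn"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Any module reported as missing can then be installed with your package manager of choice before running the notebook.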
The data is scraped from JustWatch.com, a popular streaming service aggregator.
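
To give a feel for how Beautiful Soup turns a page into structured records, here is a minimal sketch that parses an inline HTML snippet. The markup, the class names (`title-card`, `name`, `year`, `imdb`), and the function `parse_titles` are all hypothetical; JustWatch's real pages are structured differently, and the notebook's own selectors may not resemble these.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; it does not reflect the
# actual structure of JustWatch pages.
SAMPLE_HTML = """
<div class="title-card">
  <span class="name">Title A</span>
  <span class="year">2019</span>
  <span class="imdb">8.0</span>
</div>
<div class="title-card">
  <span class="name">Title B</span>
  <span class="year">2021</span>
  <span class="imdb">6.5</span>
</div>
"""


def parse_titles(html):
    """Extract name, release year, and IMDb rating from each title card."""
    # "html.parser" is Python's built-in parser, so no extra parser package is needed.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.title-card"):
        rows.append(
            {
                "name": card.select_one(".name").get_text(strip=True),
                "year": int(card.select_one(".year").get_text(strip=True)),
                "imdb_rating": float(card.select_one(".imdb").get_text(strip=True)),
            }
        )
    return rows


records = parse_titles(SAMPLE_HTML)
print(records)
```

A list of dictionaries like `records` can be passed directly to `pandas.DataFrame` for analysis. With a live page, the HTML string would come from an HTTP response instead of an inline constant.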
Note: Ensure you adhere to JustWatch's terms of service and data usage policies while scraping their website.
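
One small, automatable part of scraping responsibly is checking a site's `robots.txt` rules before fetching a URL. The sketch below uses the standard library's `urllib.robotparser` on made-up rules; the rules, the user agent string `my-scraper`, and the `example.com` URLs are placeholders and say nothing about what JustWatch actually permits.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration; these are NOT JustWatch's actual rules.
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /private/",
]

print(is_allowed(EXAMPLE_RULES, "my-scraper", "https://example.com/movies"))
print(is_allowed(EXAMPLE_RULES, "my-scraper", "https://example.com/private/data"))
```

To check a live site, `RobotFileParser` can instead load the rules with `set_url(...)` followed by `read()`. A `robots.txt` check complements, but does not replace, reading the site's terms of service.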
Feel free to customize and extend the analysis based on your requirements.
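
As a possible starting point for extending the analysis, the sketch below computes a few of the metrics this project reports (mean IMDb rating, average runtime by title type, genre counts, and the top-rated movie on one service) with Pandas. The rows, the column names (`title`, `type`, `imdb_rating`, `runtime_min`, `genres`, `service`), and the comma-separated genre format are assumptions made for illustration; the notebook's actual table may be laid out differently.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook builds its table from scraped data.
titles = pd.DataFrame(
    {
        "title": ["Title A", "Title B", "Title C", "Title D"],
        "type": ["movie", "movie", "show", "show"],
        "imdb_rating": [8.0, 6.0, 7.5, 8.5],
        "runtime_min": [120, 100, 45, 55],
        "genres": ["Drama, Crime", "Comedy", "Drama", "Drama, Comedy"],
        "service": ["Netflix", "Netflix", "Other Service", "Netflix"],
    }
)

# Mean IMDb rating across all titles.
mean_rating = titles["imdb_rating"].mean()

# Average runtime, computed separately for movies and shows.
avg_runtime = titles.groupby("type")["runtime_min"].mean()

# Split the comma-separated genre strings so each genre is counted once per title.
genre_counts = titles["genres"].str.split(", ").explode().value_counts()

# Highest-rated movie on a single service.
netflix_movies = titles[(titles["service"] == "Netflix") & (titles["type"] == "movie")]
top_netflix_movie = netflix_movies.sort_values("imdb_rating", ascending=False).iloc[0]["title"]

print(mean_rating)
print(avg_runtime)
print(genre_counts)
print(top_netflix_movie)
```

The same `groupby` pattern extends to other questions, such as the mean rating per streaming service or per release year.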
For any questions or feedback, please contact Altamash Ajaz at [email protected].