This is Ranjani Amrapali Vishwanath and Yanqing Zhou's final project for Data Science.
Requires Python 3.10 or later.
Clone the repository:
git clone [email protected]:wubba438/data_science_project.git
cd data_science_project
Create and activate a virtual environment:
python3 -m venv .my_venv
source .my_venv/bin/activate
or, if your Python 3 interpreter is invoked as python:
python -m venv .my_venv
source .my_venv/bin/activate
Install the dependencies. Make sure you install everything listed in the requirements.txt file:
python -m pip install -r requirements.txt
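To sanity-check the setup steps above, a small script along these lines could be run from inside the environment. It is only a sketch: the function name environment_problems is hypothetical and not part of the repository, and the minimum version is taken from the Python 3.10 requirement stated at the top of this README.

```python
import sys

# Minimum version, taken from the Python requirement stated in this README.
MIN_VERSION = (3, 10)


def environment_problems(version_info=None, prefix=None, base_prefix=None):
    """Return a list of setup problems; an empty list means nothing was detected.

    The arguments default to the running interpreter's values and exist
    only so the check can be exercised with other inputs.
    """
    version_info = sys.version_info if version_info is None else version_info
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    problems = []
    if tuple(version_info[:2]) < MIN_VERSION:
        problems.append("Python 3.10 or later is required")
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    if prefix == base_prefix:
        problems.append("no virtual environment appears to be active")
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print("Warning:", problem)
```

If the script prints nothing, the interpreter version and the virtual environment both look as expected.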
All of the code is located in the final_project_Ranjani_Yanqing.ipynb file. It covers the following steps:
- Data loading
- Feature extraction
- Data cleaning
- Clustering
- Heatmap
- Regression
The data can be found in the "raw_data" folder. The folder contains:
- Github_clean2 copy.csv
- The "menry copy" folder, which contains multiple CSV files corresponding to the information available in Github_clean2 copy.csv
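To get a quick overview of the files described above, something like the following could be run from the repository root. This is a hedged sketch rather than code from the notebook: the function names find_csv_files and read_header are hypothetical, and UTF-8 encoding of the CSV files is an assumption.

```python
import csv
from pathlib import Path


def find_csv_files(data_dir="raw_data"):
    """Return every .csv file under data_dir, including subfolders, sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.csv"))


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    # UTF-8 is an assumption; adjust the encoding if the files differ.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    for path in find_csv_files():
        print(path, read_header(path))
```

Because the search is recursive, the files inside the "menry copy" folder are listed alongside Github_clean2 copy.csv, and spaces in the file names need no special handling.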
The tables can be found in the "table" folder.
The figures can be found in the "figures" folder.
Here is the link to our report document: https://docs.google.com/document/d/10pi9JYhXfycIpX-PQDmYVmoPK861trCUuoXumzRrbLo/edit?hl=fr#heading=h.e239kymargau
Here is the link to our report slides: https://docs.google.com/presentation/d/1pl-2Yocco9QReSAj-XKMtwu0_AxgGnB6OJ8QrtyYlTo/edit#slide=id.g742e3e7cd_1_16