Big-data-course-CRI/project_github_2023

 
 

github project 2023-11-24

This is the final project for Data Science by Ranjani Amrapali Vishwanath and Yanqing Zhou.

Requirements

Python 3.10 or later
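
The version requirement above can be checked before installing anything. The snippet below is a minimal sketch, not part of the project: the function name meets_requirement and the constant MIN_VERSION are illustrative assumptions, and only the 3.10 lower bound comes from this README.

```python
import sys

# Lower bound taken from the Requirements section; the names are hypothetical.
MIN_VERSION = (3, 10)


def meets_requirement(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report whether the interpreter running this script is recent enough.
print("Python version OK" if meets_requirement() else "Python 3.10 or later is required")
```

Run it with the same interpreter you plan to use for the virtual environment.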

Installation

Clone the repository:

git clone git@github.com:wubba438/data_science_project.git
cd data_science_project

Create and activate a virtual environment:

python3 -m venv .my_venv
source .my_venv/bin/activate

or, if the python3 command is not available on your system:

python -m venv .my_venv
source .my_venv/bin/activate

Install all the dependencies listed in the requirements.txt file:

python -m pip install -r requirements.txt

Files

All of the code is located in the final_project_Ranjani_Yanqing.ipynb file. It covers the following steps:

  • Data loading
  • Feature extraction
  • Data cleaning
  • Clustering
  • Heatmap
  • Regression

The data can be found in the "raw_data" folder, which contains:

  • The "Github_clean2 copy.csv" file
  • The "menry copy" folder, which contains multiple CSV files corresponding to the information available in "Github_clean2 copy.csv"
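
To get a quick overview of the data files described above, the following sketch lists every CSV file under a folder and prints its header row. The helper names list_csv_files and read_header are hypothetical, and UTF-8 encoding is an assumption about the files, not something this README states.

```python
import csv
from pathlib import Path


def list_csv_files(root="raw_data"):
    """Return every .csv file under `root`, including subfolders, sorted by path."""
    return sorted(Path(root).rglob("*.csv"))


def read_header(path):
    """Return the first row (the column names) of a CSV file, or [] if it is empty."""
    # UTF-8 is assumed here; adjust the encoding if the files use another one.
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


# Prints nothing if the folder does not exist or holds no CSV files.
for csv_path in list_csv_files():
    print(csv_path, read_header(csv_path))
```

Run it from the repository root so that the relative path raw_data resolves correctly.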

The tables can be found in the "table" folder.

The figures can be found in the "figures" folder.

Here is the link to our report document: https://docs.google.com/document/d/10pi9JYhXfycIpX-PQDmYVmoPK861trCUuoXumzRrbLo/edit?hl=fr#heading=h.e239kymargau

Here is the link to our report slides: https://docs.google.com/presentation/d/1pl-2Yocco9QReSAj-XKMtwu0_AxgGnB6OJ8QrtyYlTo/edit#slide=id.g742e3e7cd_1_16
