GitHub - shyamala-venkat/ML_Project_Student_Performance_Prediction

ML Project - Student Performance Prediction

The goal is to predict the math score of students (Regression Analysis).

There are 7 independent variables:

gender - Male/Female
race/ethnicity - Group division from A to E
parental level of education - Details of parental education varying from high school to master's degree
lunch - Type of lunch selected
test preparation course - Course details
reading score - Marks secured by a student in Reading
writing score - Marks secured by a student in Writing

Target variable:

math score - Marks secured by a student in Mathematics

Dataset Source Link : https://www.kaggle.com/datasets/spscientist/students-performance-in-exams/data

Steps followed in the project:

Data Ingestion :

The input data is read from a CSV file, split into training and testing sets, and each set is saved as a CSV file.
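The ingestion step could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `ingest`, the output file names, the 80/20 split ratio, and the random seed are all assumptions.

```python
import os

import pandas as pd
from sklearn.model_selection import train_test_split


def ingest(csv_path, out_dir, test_size=0.2, random_state=42):
    """Read the raw CSV, split it, and save both parts as CSV files.

    test_size, random_state and the output file names are illustrative
    assumptions; the README does not state the values used.
    """
    df = pd.read_csv(csv_path)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )

    os.makedirs(out_dir, exist_ok=True)
    train_path = os.path.join(out_dir, "train.csv")
    test_path = os.path.join(out_dir, "test.csv")
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return train_path, test_path
```

Fixing `random_state` keeps the split reproducible, so later steps always see the same training and testing rows.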

Data Transformation :

Different transformations are applied depending on the nature of variables: Numerical, Categorical, Ordinal

Numeric variables: SimpleImputer (with strategy = median) imputes missing values with the median, and StandardScaler scales the values to zero mean and unit standard deviation.

Categorical variables: SimpleImputer (with strategy = most_frequent) imputes missing values with the most frequent value, then OneHotEncoder and StandardScaler are applied.

The numerical and categorical transformations are combined into a single preprocessor using the ColumnTransformer class.

This preprocessor object is saved as a pickle file for future use.
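A preprocessor along these lines might look like the sketch below. The column names come from the variable list above; the helper names `build_preprocessor` and `save_object`, the `handle_unknown="ignore"` setting, and `with_mean=False` (needed because OneHotEncoder returns a sparse matrix by default) are assumptions rather than the project's confirmed settings.

```python
import pickle

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC_COLS = ["reading score", "writing score"]
CATEGORICAL_COLS = [
    "gender",
    "race/ethnicity",
    "parental level of education",
    "lunch",
    "test preparation course",
]


def build_preprocessor():
    """Combine the numeric and categorical pipelines in one ColumnTransformer."""
    num_pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    cat_pipeline = Pipeline([
        ("imputer", SimpleImputer(strategy="most_frequent")),
        # Assumption: categories unseen during fitting are encoded as all zeros.
        ("one_hot", OneHotEncoder(handle_unknown="ignore")),
        # Assumption: with_mean=False, since sparse output cannot be centered.
        ("scaler", StandardScaler(with_mean=False)),
    ])
    return ColumnTransformer([
        ("num", num_pipeline, NUMERIC_COLS),
        ("cat", cat_pipeline, CATEGORICAL_COLS),
    ])


def save_object(obj, path):
    """Pickle any fitted object (preprocessor or model) for later reuse."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
```

The preprocessor should be fitted on the training set only and then reused, unchanged, on the testing set and on user input.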

Model Training :

GridSearchCV is used to find the optimal hyperparameter values for different models such as Linear Regression, Decision Tree, Random Forest, AdaBoost, and CatBoost. The model with the highest r2_score is selected and used to predict the target value. This model object is saved as a pickle file for future use.
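The selection loop could be sketched as below. To keep the example short and dependent on scikit-learn only, it covers three of the listed models and leaves out AdaBoost and CatBoost. The parameter grids, the 3-fold cross-validation, the function name `select_best_model`, and the choice to compare models by r2_score on the held-out test set are all assumptions.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor


def select_best_model(X_train, y_train, X_test, y_test):
    """Tune each candidate with GridSearchCV and keep the best test-set R^2."""
    # Illustrative grids only; the project's actual grids are not given.
    candidates = {
        "Linear Regression": (LinearRegression(), {}),
        "Decision Tree": (
            DecisionTreeRegressor(random_state=0),
            {"max_depth": [3, 5, None]},
        ),
        "Random Forest": (
            RandomForestRegressor(random_state=0),
            {"n_estimators": [16, 64]},
        ),
    }

    scores, fitted = {}, {}
    for name, (model, grid) in candidates.items():
        search = GridSearchCV(model, grid, cv=3, scoring="r2")
        search.fit(X_train, y_train)
        fitted[name] = search.best_estimator_
        scores[name] = r2_score(y_test, fitted[name].predict(X_test))

    best_name = max(scores, key=scores.get)
    return best_name, fitted[best_name], scores
```

Returning the full `scores` dictionary alongside the winner makes it easy to log how close the other candidates were.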

Prediction Pipeline :

This pipeline takes the user's input data, converts it to a DataFrame, and makes a prediction using the preprocessor and model pickle files.
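A prediction pipeline of this kind might be sketched as follows. The function names `to_dataframe` and `predict` and the idea of passing the pickle paths as arguments are hypothetical; the README does not name the actual files or classes.

```python
import pickle

import pandas as pd


def to_dataframe(gender, race_ethnicity, parental_level_of_education,
                 lunch, test_preparation_course, reading_score, writing_score):
    """Build a one-row DataFrame whose columns match the training data."""
    return pd.DataFrame([{
        "gender": gender,
        "race/ethnicity": race_ethnicity,
        "parental level of education": parental_level_of_education,
        "lunch": lunch,
        "test preparation course": test_preparation_course,
        "reading score": reading_score,
        "writing score": writing_score,
    }])


def predict(features, preprocessor_path, model_path):
    """Load the pickled preprocessor and model, then predict the math score."""
    with open(preprocessor_path, "rb") as f:
        preprocessor = pickle.load(f)
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    # Use transform, not fit_transform: the preprocessor was fitted in training.
    return model.predict(preprocessor.transform(features))
```

Only load pickle files that the project itself produced, since unpickling untrusted files can execute arbitrary code.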

Flask App creation :

A simple Flask app with a UI is created to collect the user's input data and predict the final math score.
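As a rough sketch of such an app, the example below exposes a single form endpoint and returns the prediction as JSON instead of rendering an HTML template. The route `/predict`, the form field names, the factory `create_app`, and the injected `predict_fn` callable are all hypothetical stand-ins for the project's own code.

```python
from flask import Flask, jsonify, request


def create_app(predict_fn):
    """Create a Flask app that delegates scoring to predict_fn(features)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict_route():
        form = request.form
        # Assumed form field names; the real template may use different ones.
        features = {
            "gender": form["gender"],
            "race/ethnicity": form["ethnicity"],
            "parental level of education": form["parental_level_of_education"],
            "lunch": form["lunch"],
            "test preparation course": form["test_preparation_course"],
            "reading score": float(form["reading_score"]),
            "writing score": float(form["writing_score"]),
        }
        return jsonify({"math score": float(predict_fn(features))})

    return app
```

Passing the prediction function in keeps the web layer independent of how the preprocessor and model pickle files are loaded.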
