Sleep Health and Lifestyle Analysis

Overview

This project analyzes sleep patterns and their correlation with various health metrics using machine learning techniques. By examining the relationship between sleep duration, stress levels, and other health indicators, the project aims to predict the presence of sleep disorders among individuals. This analysis can aid in identifying high-risk populations and informing interventions.

Dataset

The dataset contains various features related to sleep health and lifestyle choices. Key columns include:

Gender: Categorical variable indicating gender.
Occupation: Categorical variable indicating occupation.
BMI Category: Categorical variable indicating Body Mass Index classification.
Blood Pressure: Numerical value indicating blood pressure.
Sleep Disorder: Target variable indicating the presence of a sleep disorder.
Sleep Duration: Numerical value representing average sleep duration.
Stress Level: Numerical value indicating daily stress level.

Example Data

Gender	Occupation	BMI Category	Blood Pressure	Sleep Disorder	Sleep Duration	Stress Level
Male	Engineer	Normal	120	No	7	3
Female	Teacher	Overweight	135	Yes	5	7

Data Preprocessing

Steps Taken:

Loading the Data: The dataset is loaded using Pandas, providing a DataFrame structure for easier manipulation.
Handling Missing Values: A preliminary check is conducted to identify and quantify missing values across all columns, with potential strategies for imputation discussed.
Encoding Categorical Variables: Categorical variables are converted to numerical codes using astype('category').cat.codes, facilitating machine learning model training.
Feature Scaling: Not applied in the current version. Scaling the numerical features in a future iteration could improve convergence for scale-sensitive models such as Logistic Regression.
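
The loading, missing-value, and encoding steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the inline CSV stands in for dataset.csv, its third row is invented so the missing-value check has something to report, and median imputation is only one possible strategy, chosen here as an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for dataset.csv: the two rows from the example table
# plus one invented row with gaps.
raw_csv = io.StringIO(
    "Gender,Occupation,BMI Category,Blood Pressure,Sleep Disorder,Sleep Duration,Stress Level\n"
    "Male,Engineer,Normal,120,No,7,3\n"
    "Female,Teacher,Overweight,135,Yes,5,7\n"
    "Female,Engineer,Normal,,No,,4\n"
)

# Step 1: load the data into a DataFrame (the project would pass a file path).
df = pd.read_csv(raw_csv)

# Step 2: count missing values per column and show only the affected columns.
missing_counts = df.isnull().sum()
print(missing_counts[missing_counts > 0])

# One possible imputation strategy (an assumption): fill numeric gaps with the median.
numeric_cols = ["Blood Pressure", "Sleep Duration", "Stress Level"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Step 3: convert each categorical column to integer codes.
categorical_cols = ["Gender", "Occupation", "BMI Category", "Sleep Disorder"]
for col in categorical_cols:
    df[col] = df[col].astype("category").cat.codes

print(df)
```

Note that cat.codes assigns integers in sorted category order, so the codes imply an ordering that nominal columns such as Occupation do not really have. Tree-based models tolerate this; for linear models, one-hot encoding may be a safer choice.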

Exploratory Data Analysis (EDA)

EDA is conducted to gain insights into the dataset:

Histograms: Visualize the distribution of sleep duration with a Kernel Density Estimate (KDE) overlay.
Box Plots: Highlight potential outliers in sleep duration.
Scatter Plots: Examine relationships between sleep duration and stress levels.
Correlation Heatmap: Analyze correlations among features.

Statistical Analysis

Correlation Metrics

Pearson Correlation: Measures linear relationships between sleep duration and blood pressure.
Spearman Correlation: Assesses monotonic relationships.
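
To illustrate how the two coefficients differ, the sketch below uses scipy.stats on made-up values in which blood pressure falls monotonically but not linearly as sleep increases. The numbers are invented for illustration and say nothing about the real dataset.

```python
import numpy as np
from scipy import stats

# Made-up values: blood pressure decreases with sleep, following a curve.
sleep_hours = np.linspace(4.0, 9.0, 20)
blood_pressure = 150.0 - 0.3 * (sleep_hours - 4.0) ** 3

# Pearson measures how well a straight line describes the relationship.
pearson_r, pearson_p = stats.pearsonr(sleep_hours, blood_pressure)

# Spearman works on ranks, so any strictly monotonic relationship scores +/-1.
spearman_rho, spearman_p = stats.spearmanr(sleep_hours, blood_pressure)

print(f"Pearson r    = {pearson_r:.3f} (p = {pearson_p:.3g})")
print(f"Spearman rho = {spearman_rho:.3f} (p = {spearman_p:.3g})")
```

Because the relationship is strictly decreasing, Spearman's rho reaches -1, while Pearson's r stays above -1 since the curve is not a straight line. A gap between the two on real data is a hint that the relationship is monotonic but nonlinear.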

Regression Analysis

A regression plot visualizes the relationship between sleep duration and blood pressure.
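
The line drawn by such a plot is an ordinary least-squares fit. As a rough sketch, the same fit can be computed with scipy.stats.linregress; the data is synthetic, the helper predict_bp is hypothetical, and seaborn's regplot would draw this line together with the scatter points.

```python
import numpy as np
from scipy import stats

# Synthetic data with a known downward trend plus noise (not the real dataset).
rng = np.random.default_rng(42)
sleep_hours = rng.uniform(4.0, 9.0, size=100)
blood_pressure = 160.0 - 5.0 * sleep_hours + rng.normal(0.0, 4.0, size=100)

# Ordinary least-squares fit of blood pressure on sleep duration.
fit = stats.linregress(sleep_hours, blood_pressure)
print(f"slope={fit.slope:.2f}, intercept={fit.intercept:.2f}, r^2={fit.rvalue ** 2:.2f}")


def predict_bp(hours):
    """Hypothetical helper: blood pressure predicted by the fitted line."""
    return fit.intercept + fit.slope * hours


print(f"Predicted blood pressure at 7 hours of sleep: {predict_bp(7.0):.1f}")
```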

Machine Learning Models

Multiple models are employed to predict sleep disorders:

Logistic Regression
Decision Tree Classifier
Random Forest Classifier
AdaBoost Classifier
Recursive Feature Elimination (RFE), used for feature selection alongside the classifiers above rather than as a classifier itself
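
A minimal sketch of how these estimators might be trained side by side with scikit-learn is shown below. It runs on a synthetic dataset from make_classification; the split ratio, hyperparameters, and the choice of Logistic Regression as the RFE base estimator are assumptions, not the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the encoded dataset.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Fit each classifier and record its accuracy on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {accuracies[name]:.3f}")

# RFE repeatedly drops the weakest feature until three remain.
selector = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=3)
selector.fit(X_train, y_train)
print("Features kept by RFE:", selector.support_)
```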

Model Evaluation

Models are evaluated using:

Classification Reports: Detailing precision, recall, F1-score, and support for each class.
ROC AUC Scores: Summarize in a single number how well each model separates the classes across all decision thresholds.
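
Both metrics are available in sklearn.metrics. The sketch below assumes a binary Yes/No target, as in the example table, and uses synthetic data with a Logistic Regression model chosen only for illustration; the class labels passed to target_names are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for the encoded dataset.
X, y = make_classification(n_samples=300, n_features=6, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The classification report is computed from hard class predictions.
y_pred = clf.predict(X_test)
report = classification_report(
    y_test, y_pred, target_names=["No disorder", "Disorder"]
)
print(report)

# ROC AUC needs scores, here the predicted probability of the positive class.
y_score = clf.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, y_score)
print(f"ROC AUC = {auc:.3f}")
```

If the target had more than two classes, roc_auc_score would need the full probability matrix and a multi_class setting such as "ovr".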

Hyperparameter Tuning

Grid Search is applied to optimize hyperparameters for the Random Forest Classifier.
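
A grid search of this kind might look like the following sketch with GridSearchCV. The parameter grid, the number of folds, and the scoring metric are illustrative assumptions; the notebook's actual grid is not documented here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary data standing in for the encoded dataset.
X, y = make_classification(n_samples=200, n_features=6, random_state=2)

# Hypothetical grid: every combination is evaluated with cross-validation.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=3,               # 3-fold cross-validation per combination
    scoring="roc_auc",  # assumes a binary target
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated ROC AUC: {search.best_score_:.3f}")
```

After fitting, search.best_estimator_ holds a Random Forest refit on all the supplied data with the winning parameters.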

Confusion Matrix Visualization

Confusion matrices are generated for each model to visualize performance metrics, including true positives, false positives, true negatives, and false negatives.
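
As a small worked example, the sketch below builds a confusion matrix from ten hand-written labels and unpacks the four counts. The labels are invented for illustration, and the display labels and colour map are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Invented labels: 1 = sleep disorder, 0 = no sleep disorder.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

# For a binary problem, ravel() yields the counts in this order.
tn, fp, fn, tp = cm.ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=4, FP=1, FN=1, TP=4

# Render the matrix as a labelled heatmap.
disp = ConfusionMatrixDisplay(
    confusion_matrix=cm, display_labels=["No disorder", "Disorder"]
)
disp.plot(cmap="Blues")
disp.ax_.set_title("Confusion matrix (illustrative)")
```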

Sample Confusion Matrix

[Image: sample confusion matrix (confusion_matrix_image.png)]

Requirements

To run this project, you will need the following Python libraries:

pandas
numpy
matplotlib
seaborn
scipy
scikit-learn

Install the required packages using:

pip install -r requirements.txt

How to Run the Code

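Clone the repository to your local machine:

git clone https://github.com/aboodcs/SleepHealthAnalysis
cd SleepHealthAnalysis

Then open sleep_health.ipynb in a notebook environment such as Jupyter and run the cells from top to bottom.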

Conclusion

This project demonstrates the application of data analysis and machine learning techniques to investigate sleep health patterns. It highlights the importance of sleep duration in relation to health indicators and the effectiveness of various models in predicting sleep disorders. Insights derived from this analysis could inform public health strategies aimed at improving sleep health.
