To be financially stable is everyone’s dream. However, rising medical and treatment costs can strain your savings. Health insurance helps reduce the cost of medical care in the event of an illness or accident, and it also covers preventive medicine such as routine medical tests, check-ups, and screening tests. Its main benefits are cashless treatment, pre- and post-hospitalization cost coverage, transportation facilities, a No Claim Bonus (NCB), medical check-ups, room rent, and tax benefits. Health insurance is typically offered as one- to three-year contracts and requires renewal based on the chosen plan.
- The objective is to predict insurance policy renewals for existing customers using predictive modeling.
- To segment existing customers for better targeting through focused marketing strategies.
These are the Software, Tools, and Environments used in the project.
HTML, CSS, JS: HTML provides the page markup, Cascading Style Sheets (CSS) control how that markup is presented, and JavaScript adds interactivity. Together they are the cornerstone technologies of the World Wide Web.
Flask: Flask lets end users interact with your Python code (in this case our ML models) directly from their web browser, without needing any libraries or code files on their own machine.
Tableau: A tool we used for visualization.
MySQL: A database used for storing the data.
Heroku: A platform used for deploying the model.
The first step in data understanding is data collection. The feature set used in this project was taken from a MySQL database and contains 42 features related to health insurance policy renewal. The output variable for this project is “Renewal: Yes or No”, which is a discrete (binary) variable, so the project focuses on classification algorithms.
- Data cleansing is the first process carried out after data collection.
- We performed outlier treatment on the features that had outliers, because the outliers were distorting the mean values.
- Categorical variables were converted to numeric codes using Label Encoding.
- We also standardized the data.
- Business moment decisions (summary statistics) and graphical interpretations of the data were produced before and after data cleansing to analyze the statistics of the data.
- Visualization of univariate and bivariate plots was done in Python and Tableau.
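The cleansing steps above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the column names and values are hypothetical, and IQR-based capping is assumed as the outlier treatment because the write-up does not say which method was used.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data; the real project uses 42 features loaded from MySQL.
df = pd.DataFrame({
    "premium": [1200.0, 1350.0, 1100.0, 1280.0, 9000.0, 1250.0],  # 9000 is an outlier
    "age": [25.0, 41.0, 37.0, 52.0, 46.0, 33.0],
    "plan_type": ["basic", "gold", "basic", "silver", "gold", "silver"],
})


def cap_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


numeric_cols = ["premium", "age"]

# 1. Outlier treatment (assumed method: IQR capping).
for col in numeric_cols:
    df[col] = cap_outliers_iqr(df[col])

# 2. Label Encoding: each category becomes an integer code (alphabetical order).
df["plan_type"] = LabelEncoder().fit_transform(df["plan_type"])

# 3. Standardization: zero mean and unit variance per numeric column.
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])
```

Capping before standardizing matters here: an extreme value would otherwise inflate the mean and standard deviation that the scaler uses.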
- Shallow Models (KNN, Naive Bayes, Decision Tree)
- Ensemble Model (Random Forest)
- Regression Model (Logistic Regression)
- Support Vector Machine (SVM)
- Artificial Neural Network (ANN)
- Hierarchical Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- K-Means Clustering
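As an illustration of the segmentation side of the list above, here is a minimal K-Means sketch. It runs on synthetic data from `make_blobs` rather than the project's customer data, and choosing the number of clusters by silhouette score is an assumption; the write-up does not state how the cluster count was selected.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized customer features.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)
X = StandardScaler().fit_transform(X)

# Try several cluster counts and keep the silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Pick the cluster count with the highest silhouette score (assumed criterion).
best_k = max(scores, key=scores.get)

# Final segment label for every customer row.
segments = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit_predict(X)
```

The resulting `segments` array can be joined back to the customer table so that each segment is profiled and targeted separately.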
Model hyper-parameter tuning methods used:
- Cross Validation
- GridSearchCV
- RandomizedSearchCV
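The tuning methods listed above can be combined in a few lines. This sketch uses synthetic data and a small, hypothetical parameter grid for a Random Forest; the grid values and the F1 scoring choice are assumptions, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data standing in for the renewal dataset.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid: 2 x 2 = 4 candidate settings.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# Each candidate is scored with 5-fold cross-validation on the training split.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# score() uses the same metric as `scoring`, so this is the held-out F1 score.
test_f1 = search.score(X_test, y_test)
```

`RandomizedSearchCV` follows the same pattern but takes `param_distributions` and an `n_iter` budget, which is usually cheaper when the grid is large.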
Model Accuracy Measures:
- Confusion matrix
- Accuracy
- F1 score
- ROC (Receiver Operating Characteristic) curve & AUC (Area Under the Curve)
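A short sketch of how the measures above can be computed with scikit-learn. The data is synthetic and Logistic Regression is used only as an example classifier; none of the numbers it produces relate to the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for "Renewal: Yes or No".
X, y = make_classification(n_samples=500, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)              # hard class labels
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    # AUC is computed from probabilities, not from hard labels.
    "roc_auc": roc_auc_score(y_test, y_prob),
}
```

Accuracy alone can mislead when one class dominates, which is why F1 and AUC are reported alongside it.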
Flask:
- Flask is a micro-framework for building web applications in Python. It began as a simple wrapper around Werkzeug (a WSGI toolkit) and Jinja and has become one of the most popular Python web application frameworks.
- Flask and the Gunicorn ("Green Unicorn") module must be installed in the project environment using “pip install flask gunicorn”.
- Gunicorn is a Python WSGI HTTP server that uses a pre-fork worker model.
Heroku:
- Heroku is a cloud platform as a service (PaaS) supporting several programming languages.
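To make the deployment setup above concrete, here is a minimal Flask sketch. The `/predict` route, the `satisfaction_score` field, and the rule inside `predict_renewal` are all hypothetical placeholders; in the real project this function would call the trained classifier instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_renewal(features: dict) -> str:
    """Hypothetical stand-in for the trained model's prediction."""
    # Placeholder rule only; replace with model.predict(...) in a real app.
    return "Yes" if features.get("satisfaction_score", 0) >= 3 else "No"


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or invalid JSON body.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify(error="expected a JSON object"), 400
    return jsonify(renewal=predict_renewal(payload))
```

Assuming the file is saved as `app.py`, Gunicorn can serve it with `gunicorn app:app`, and on Heroku the same command would typically go in a `Procfile` as `web: gunicorn app:app`.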
- This project is an exploratory attempt to understand the factors that affect renewal decisions in the health insurance market.
- The prediction model helped us identify the features that contributed most to the renewal of health insurance policies.
- Customer segmentation provided further insights into the business that can support a targeted marketing approach.
- The results also suggest customer satisfaction is a significant factor in influencing the renewal decision of policyholders.