Using Kaggle Datasets of Titanic Machine Learning Competition to predict the survivors

digs1998/Titanic-Survivors-predictions

Titanic-dataset

Using the Titanic data to predict passenger survival. Workflow of the project (work still in progress):

  1. Loading libraries: a. NumPy b. pandas c. Matplotlib and seaborn d. sklearn for data preprocessing, algorithms, and accuracy metrics

  2. Exploratory Data Analysis

- Explored the data: the number of rows and columns (shape) of the training and testing data, and the missing values in the dataset.

- Dummy encoding applied to the categorical data.

- Certain algorithms need features on a comparable scale to work well, so I standardized the data using StandardScaler.

  3. Training and testing: imported KNN, GaussianNB, DecisionTree, and other classifiers from sklearn, and used train_test_split (from sklearn's model_selection module) to hold out test data so that overfitting can be detected.
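The preprocessing in step 2 and the hold-out split in step 3 can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the tiny hand-made DataFrame stands in for Kaggle's train.csv, the column choice is an assumption, and filling missing ages with the median is just one simple option that the project does not necessarily use.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Tiny hand-made stand-in for the Kaggle train.csv (illustrative values only).
train = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass": [3, 1, 3, 3, 2, 3, 1, 2],
    "Sex": ["male", "female", "female", "male", "female", "male", "male", "female"],
    "Age": [22.0, 38.0, None, 35.0, 27.0, None, 54.0, 14.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 11.13, 8.46, 51.86, 30.07],
    "Embarked": ["S", "C", "S", "S", "S", "Q", "S", "C"],
})

# EDA: dataset shape and number of missing values per column.
print(train.shape)
missing = train.isnull().sum()
print(missing)

# Assumption: fill missing ages with the median age.
train["Age"] = train["Age"].fillna(train["Age"].median())

# Dummy-encode the categorical columns (dtype=float keeps everything numeric).
features = pd.get_dummies(
    train.drop(columns="Survived"), columns=["Sex", "Embarked"], dtype=float
)
target = train["Survived"]

# Hold out part of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.25, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler only on the training split keeps information from the test rows out of the preprocessing, so the test score stays an honest estimate.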

Optional: data visualization. I tried to make the notebook more interactive.

Work in progress! Accuracy is 0.77 so far; I will be improving it.
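A held-out accuracy figure like the one above can be measured along these lines. This is only a sketch: the data comes from sklearn's make_classification as a synthetic stand-in for the preprocessed Titanic features, and the hyperparameters (n_neighbors=5, max_depth=4) are assumptions, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed Titanic features (not the real data).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)

# Keep 20% of the rows aside; accuracy is measured only on these.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hyperparameters here are illustrative assumptions.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "GaussianNB": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from best to worst held-out accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Comparing several classifiers on the same held-out split makes it easier to see which one is worth tuning further.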

To get a better understanding of the workflow of a Machine Learning project, have a read:

  1. The sklearn documentation (recommended).

  2. https://medium.com/analytics-vidhya/workflow-guide-to-machine-learning-c0545c843f04 (my blog on machine learning, do read it!)

  3. https://medium.com/@NotAyushXD/workflow-of-a-machine-learning-project-ec1dba419b94

  4. https://www.kaggle.com/digvijayyadav/titanic-codesprediction (do upvote it if you like my kernel)
