Dhanshreeb05/Titanic

Titanic

Here is the approach we followed in building the model.

1. Studying the necessary libraries and importing them

NumPy and pandas are used to manipulate the dataframe and its columns and cells. Matplotlib and Seaborn are used to visualize our data.

2. Loading and Viewing Data Set to study the features available

We load both the training and test data sets and look at the data tables to see the values that we'll be working with.
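As a rough sketch of this step, the snippet below reads a table with pandas and prints a quick overview. The inline CSV text is a hypothetical stand-in so the example runs without the real files, and the file name `train.csv` in the comment is an assumption about how the data is stored.

```python
import io

import pandas as pd

# Hypothetical stand-in for the training file so the sketch runs on its own.
csv_text = """PassengerId,Survived,Pclass,Sex,Age,Fare,Embarked
1,0,3,male,22,7.25,S
2,1,1,female,38,71.2833,C
3,1,3,female,,7.925,S
"""

# With the real data this would be something like pd.read_csv("train.csv").
train = pd.read_csv(io.StringIO(csv_text))

print(train.head())        # first rows of the table
print(train.shape)         # (number of rows, number of columns)
print(train.isna().sum())  # count of missing values per column
```

The same calls can be repeated for the test set to compare its columns with the training set.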

3. Filling the NaN values

The NaN values in the Age column are filled with the mean age. The NaN values in the Fare column are filled with the mean fare.
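A minimal sketch of mean imputation with pandas, using a tiny made-up table in place of the real data:

```python
import numpy as np
import pandas as pd

# Toy values for illustration only, not taken from the data set.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0],
    "Fare": [7.25, 71.28, np.nan],
})

# mean() skips NaN by default, so each mean comes from the known values only.
df["Age"] = df["Age"].fillna(df["Age"].mean())
df["Fare"] = df["Fare"].fillna(df["Fare"].mean())

print(df)
```

Assigning the result back to the column avoids relying on in-place modification.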

4. Studying the data by plotting and visualization

All the features of the training data were plotted against the Survived column. We observed that passengers with Pclass 1 had the highest chance of surviving and those with Pclass 3 the lowest. Females were also observed to survive more often than males. People with fewer SibSp and Parch had a higher chance of survival.
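The numbers behind such plots can be sketched as grouped survival rates. The table below is made up for illustration and is not the real data, so the printed rates say nothing about the actual passengers.

```python
import pandas as pd

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Survived": [1, 1, 1, 0, 1, 0],
})

# The mean of a 0/1 column is the fraction of survivors in each group.
rate_by_class = toy.groupby("Pclass")["Survived"].mean()
rate_by_sex = toy.groupby("Sex")["Survived"].mean()

print(rate_by_class)
print(rate_by_sex)
```

With Seaborn, a call such as `sns.barplot(x="Pclass", y="Survived", data=train)` draws the same per-group means as bars.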

5. Feature Engineering

Because the Sex and Embarked columns hold categorical strings, we have to represent them as numerical values in order to perform classification with our model.

This is done by creating a new column Place that maps Embarked with {'S': 1, 'C': 2, 'Q': 3}. In the Sex column, male is mapped to 2 and female to 1, and a new column Gender is made. The Age and Fare columns are also mapped to numbers, ordered by survival for ease of classification, and a new column named A is formed.
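The Place and Gender mappings can be sketched with `Series.map`. The three rows here are toy values, and the Age and Fare coding is left out because its exact bins are not given above.

```python
import pandas as pd

# Toy rows for illustration only.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "Q"],
})

# Dictionaries as described in the text above.
df["Place"] = df["Embarked"].map({"S": 1, "C": 2, "Q": 3})
df["Gender"] = df["Sex"].map({"male": 2, "female": 1})

print(df)
```

Any value that is not a key in the dictionary, including a missing value, becomes NaN after `map`, so missing entries need to be handled before or after this step.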

A new column Family is made by adding SibSp and Parch, which gives the total number of family members on board for each passenger.

The column PGA is made by multiplying Pclass, Gender, and A.
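A short sketch of the Family and PGA columns. The Gender and A values below are placeholders, since the real coding of A depends on the Age and Fare mapping, which is not spelled out here.

```python
import pandas as pd

# Toy rows; the Gender and A codes are placeholder values for illustration.
df = pd.DataFrame({
    "SibSp": [1, 0, 3],
    "Parch": [0, 0, 2],
    "Pclass": [3, 1, 2],
    "Gender": [2, 1, 1],
    "A": [1, 2, 3],
})

# Family: siblings/spouses plus parents/children travelling with the passenger.
df["Family"] = df["SibSp"] + df["Parch"]

# PGA: element-wise product of the three coded columns.
df["PGA"] = df["Pclass"] * df["Gender"] * df["A"]

print(df[["Family", "PGA"]])
```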

The titles are also extracted from the Name column and mapped to numbers. A title such as Ms. or Mr. may provide a hint as to whether the passenger survived or not.
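One possible way to pull out the title is a regular expression that captures the text between the comma and the first period in each name. The numeric codes in the dictionary are hypothetical, since the actual mapping is not listed above.

```python
import pandas as pd

# Names in the "Surname, Title. Given names" form used by the data set.
names = pd.Series([
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
])

# Capture everything after the comma up to the first period.
titles = names.str.extract(r",\s*([^.]+)\.", expand=False).str.strip()

# Hypothetical codes for illustration only.
title_codes = titles.map({"Mr": 1, "Mrs": 2, "Miss": 3})

print(titles.tolist())
print(title_codes.tolist())
```

Titles that are missing from the dictionary come out as NaN, so rare titles need either their own code or a shared "other" code.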

6. Model Fitting and Predicting

We import different classifiers from sklearn and measure the accuracy of each model. Lastly, the Survived column of the test data set is predicted.
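The comparison can be sketched as below. The two classifiers, the hold-out split, and the synthetic features are all assumptions made for illustration, since the classifiers we tried are not listed here. The data is randomly generated, so the scores say nothing about the real problem.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for engineered features (values 1 to 3) and labels.
rng = np.random.default_rng(0)
X = rng.integers(1, 4, size=(200, 3)).astype(float)
y = (X[:, 0] + X[:, 1] <= 4).astype(int)  # made-up rule, not real survival
X_test = rng.integers(1, 4, size=(10, 3)).astype(float)  # unlabeled "test" rows

# Hold out a quarter of the labeled rows to measure accuracy.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Example classifiers; any sklearn classifier with fit/score/predict fits here.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)  # accuracy on held-out rows

# Use the best-scoring model to predict the unlabeled rows.
best = max(scores, key=scores.get)
predictions = models[best].predict(X_test)

print(scores)
print(best, predictions)
```

Scoring on held-out rows rather than on the training rows gives a less optimistic estimate of how the model will do on the test set.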
