pytanic

Data Visualization using Python

RMS Titanic is known for its infamous sinking in the North Atlantic Ocean on 15 April 1912. It ranks among the deadliest maritime tragedies of all time, killing more than 1,500 people of the estimated 2,224 passengers and crew. The disaster drew much public attention, which not only led to better safety guidelines for ships but also provided foundational material for the disaster film genre.

The dataset contains the details of only 891 passengers.

Project Components

Data Inspection
Data Cleaning
Data Visualization
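
As a rough illustration of the first two components, here is a minimal sketch using pandas. The sample rows, the column names and the fill strategy (median age, most common port) are assumptions made for this example; the notebook itself reads Titanic-DataSet.csv and may clean the data differently.

```python
import io

import pandas as pd

# Hypothetical sample rows, for illustration only.
# The real project would read Titanic-DataSet.csv instead.
SAMPLE_CSV = """Survived,Pclass,Sex,Age,Fare,Embarked
0,3,male,22,7.25,S
1,1,female,38,71.28,C
1,3,female,,7.92,S
0,3,male,35,8.05,
"""


def inspect(df):
    """Data inspection: count the missing values in each column."""
    return df.isna().sum()


def clean(df):
    """Data cleaning: fill missing ages with the median age and
    missing ports with the most common port (an assumed strategy)."""
    out = df.copy()
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    return out


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
missing = inspect(raw)
cleaned = clean(raw)

print(missing)
print(cleaned)
```

Running the inspection before and after cleaning is a quick way to confirm that no missing values remain in the columns you plan to plot.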

Overview of the Dataset

The table below describes each column and the values it can take, which is essential for any analyst working with the dataset.

Variable	Attributes / Definition	Meaning (if any)
Survival	0, 1	No, Yes
pclass	1, 2, 3	Class A, Class B, Class C
Sex	F, M	Female, Male
Age	Age in years
sibsp	Sibling / Spouse	Sibling: brother, sister, stepbrother, stepsister. Spouse: husband, wife (mistresses and fiancés were ignored)
parch	Parent / Child	Parent: mother, father. Child: daughter, son, stepdaughter, stepson
ticket	Ticket number
fare	Passenger fare
cabin	Cabin number
embarked	Port of embarkation: C, Q, S	Cherbourg, Queenstown, Southampton
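
The coded columns in the table can be turned into readable labels before plotting. The sketch below shows one way to do that with pandas; the mappings come from the table, while the lowercase column names and the three sample rows are assumptions made for this example.

```python
import pandas as pd

# Code-to-label mappings taken from the table above.
SURVIVAL_LABELS = {0: "No", 1: "Yes"}
PORT_LABELS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}

# Hypothetical rows; the column names are assumed, not read from the CSV.
df = pd.DataFrame({"survival": [0, 1, 1], "embarked": ["S", "C", "Q"]})

# Replace each code with its label, leaving the original frame untouched.
decoded = df.assign(
    survival=df["survival"].map(SURVIVAL_LABELS),
    embarked=df["embarked"].map(PORT_LABELS),
)

print(decoded)
```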

NOTE

In other projects you will notice that the analyst has two .csv files, namely train.csv and test.csv.

test.csv ➨ used for testing the generated model.
train.csv ➨ used for training the model on the dataset we work with.

The results a model produces also vary with the share of the dataset allotted to each of the two .csv files.
This means we may get different results when the data is split 50-50 between train.csv and test.csv than when it is split 70-30.

In my project, however, there is only one .csv file, because I decided not to divide the dataset in any way and to work with it in its entirety.
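
For readers who do want such a split, here is a minimal sketch of a 70-30 and a 50-50 division using pandas. The helper name split_dataset, the seed and the ten placeholder rows are assumptions made for this example; this project does not use a split.

```python
import pandas as pd


def split_dataset(df, train_fraction, seed=0):
    """Randomly split df into a training part and a testing part.

    split_dataset is a hypothetical helper written for this example.
    """
    train = df.sample(frac=train_fraction, random_state=seed)
    test = df.drop(train.index)  # every row that was not sampled
    return train, test


# Ten placeholder rows standing in for the passengers.
passengers = pd.DataFrame({"passenger_id": range(1, 11)})

train_70, test_30 = split_dataset(passengers, 0.7)
train_50, test_50 = split_dataset(passengers, 0.5)

print(len(train_70), len(test_30))
print(len(train_50), len(test_50))
```

Because the rows are sampled at random, a model trained on each split sees different passengers, which is one reason the results can differ between ratios.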

Conclusion

We can draw the following conclusions after analysing the data.

Most passengers were travelling to
Women were given priority during the evacuation
The chance of survival was correlated with the fare paid by each passenger
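
The last two conclusions could be checked along these lines: compare survival rates by sex, and compute the correlation between fare and survival. This is only a sketch; the six rows below are made up to show the calculation and say nothing about the real data, and the column names are assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; they are not taken from the dataset.
df = pd.DataFrame(
    {
        "Survived": [1, 1, 0, 0, 1, 0],
        "Sex": ["F", "F", "M", "M", "M", "F"],
        "Fare": [80.0, 30.0, 8.0, 7.5, 55.0, 13.0],
    }
)

# Share of survivors within each sex.
survival_by_sex = df.groupby("Sex")["Survived"].mean()

# Pearson correlation between the fare paid and survival (0 or 1).
fare_survival_corr = df["Fare"].corr(df["Survived"])

print(survival_by_sex)
print(fare_survival_corr)
```

On the full dataset, the same two lines would be run on the cleaned DataFrame instead of the made-up rows.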

You can see the online deployment of the notebook by clicking on this link

Links to the resources I learnt from:

1. https://medium.com/analytics-vidhya/data-visualization-titanic-data-set-91531c3ab5a6
2. https://medium.com/@rohanhgupta91/analyze-titanic-dataset-of-kaggle-ab220334b75c
3. https://medium.com/analytics-vidhya/what-is-the-difference-between-training-and-test-dataset-d20820e5f632
4. https://towardsdatascience.com/machine-learning-with-the-titanic-dataset-7f6909e58280
5. https://www.kaggle.com/subinium/awesome-visualization-with-titanic-dataset
6. https://www.kaggle.com/startupsci/titanic-data-science-solutions/
7. https://github.com/abhishekchhibber/Titanic-Data-Visualization
8. https://mastermindlab.github.io/titanic/
9. https://harvard-iacs.github.io/2019-CS109A/labs/lab-5/student/


anushkaguptaaa/pytanic
