deep-diver/Data-Analysis-on-Titanic

Data Analysis on Titanic Data Set

Contents

  • About Project
  • About the Data Set
  • Resources
  • Dependencies

About Project

This project has two purposes. The first is to practice data analysis on a well-known data set, so that it is easier to check my results. The second is to become familiar with Python for data analysis, including its plotting, math, and data frame libraries. The end goal of the project is to predict a passenger's survivability.

About the Data Set

The data set includes 891 entries (observations), and each entry has 11 different variables to describe a passenger.

  • survival : Survival (0 = No, 1 = Yes)
  • pclass : Ticket class (1 = 1st/Upper, 2 = 2nd/Middle, 3 = 3rd/Lower)
  • sex : Sex
  • Age : Age in years
  • sibsp : # of siblings / spouses aboard the Titanic (siblings = brother/sister/stepbrother/stepsister, spouse = husband/wife)
  • parch : # of parents / children aboard the Titanic (parent = mother/father, child = daughter/son/stepdaughter/stepson). Some children travelled only with a nanny, therefore parch = 0 for them.
  • ticket : Ticket number
  • fare : Passenger fare
  • cabin : Cabin number
  • embarked : Port of Embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
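
As a minimal sketch of how these variables might be loaded and inspected with pandas: the column names below (`Survived`, `Pclass`, `Sex`, and so on) follow Kaggle-style capitalization and are an assumption about the header of `titanic-data.csv`. The three sample rows are made up for illustration, and `load_titanic` and `survival_rate_by` are hypothetical helpers, not functions from the notebook.

```python
import io

import pandas as pd

# Made-up rows for illustration only; NOT taken from titanic-data.csv.
# The header names are an assumption (Kaggle-style capitalization).
SAMPLE_CSV = """Survived,Pclass,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,3,male,30,0,0,T1,8.0,,S
1,1,female,40,1,0,T2,70.0,C10,C
1,3,female,,0,2,T3,12.5,,Q
"""


def load_titanic(source):
    """Read the CSV from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


def survival_rate_by(df, column):
    """Mean of the 0/1 Survived flag within each group of `column`."""
    return df.groupby(column)["Survived"].mean()


# Swap the StringIO for "titanic-data.csv" to read the real file.
df = load_titanic(io.StringIO(SAMPLE_CSV))
missing_per_column = df.isna().sum()  # number of missing values per variable
rate_by_sex = survival_rate_by(df, "Sex")
```

Counting missing values first is useful because empty fields (here in `Age` and `Cabin`) are read as NaN and need handling before any prediction step.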

Resources

  • titanic-analysis.ipynb: the analysis, as an IPython (Jupyter) notebook
  • titanic-data.csv: the Titanic data set in CSV format

Dependencies

  • pandas: data frame library
  • numpy: mathematical library, whose arrays partly overlap with data frame functionality
  • matplotlib: plotting library
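
A rough sketch of how the three libraries can work together is shown here. The ages are made up for illustration, and filling a missing age with the mean is only one possible choice, not necessarily what the notebook does.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up ages for illustration only; NaN stands for a missing value.
ages = pd.Series([22.0, 38.0, np.nan, 35.0, 4.0], name="Age")

mean_age = np.nanmean(ages.to_numpy())  # numpy: mean that ignores NaN
filled = ages.fillna(mean_age)          # pandas: fill the gap with that mean

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(filled, bins=4)  # matplotlib: histogram of ages
ax.set_xlabel("Age in years")
ax.set_ylabel("Passengers")
plt.close(fig)  # release the figure once it is no longer needed
```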
