Marouane666/Titanic

Titanic Data Science Solution

The notebook walks through a typical workflow for solving data science competitions on sites like Kaggle.

Workflow stages

  1. Data importation

  2. Data visualization and analysis

  3. Machine Learning

The workflow indicates the general sequence in which the stages follow one another:

  1. We import the dataset, along with the libraries we will use to work with and manipulate the data.

  2. We visualize the patterns in the data and the correlations among its features.

  3. We use the preprocessed data to predict the test set with different models.
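The three stages above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes pandas and scikit-learn, uses a small made-up in-memory table in place of the Kaggle `train.csv`, assumes the usual Kaggle column names (`Survived`, `Pclass`, `Sex`, `Age`, `Fare`), and picks two models arbitrarily.

```python
import io

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the Kaggle train.csv (made-up rows, assumed columns).
TRAIN_CSV = """Survived,Pclass,Sex,Age,Fare
0,3,male,22,7.25
1,1,female,38,71.28
1,3,female,26,7.92
1,1,female,35,53.10
0,3,male,35,8.05
0,1,male,54,51.86
0,3,male,2,21.08
1,2,female,14,30.07
"""

# Stage 1: data importation.
train = pd.read_csv(io.StringIO(TRAIN_CSV))

# Stage 2: analysis, here the survival rate for each sex.
survival_by_sex = train.groupby("Sex")["Survived"].mean()

# Stage 3: machine learning with more than one model.
# The text column is turned into a 0/1 flag so the models can use it.
X = train[["Pclass", "Age", "Fare"]].assign(
    is_female=(train["Sex"] == "female").astype(int)
)
y = train["Survived"]

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Accuracy on the training rows only; a real comparison would use held-out data.
train_accuracy = {name: model.fit(X, y).score(X, y) for name, model in models.items()}
```

In practice the scores would be computed on a validation split or through cross-validation, since training accuracy alone overstates how well a model will do on the test set.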

Workflow goals

  1. We may want to classify or categorize our samples, and to understand how different classes correlate with our solution goal.

  2. We may also want to determine correlations among features other than survival for subsequent goals and workflow stages. Correlating certain features may help in creating, completing, or correcting features.

  3. Depending on the choice of model algorithm, all features may need to be converted to numerical equivalents.

  4. Data preparation may also require us to estimate missing values within a feature.

  5. We may also analyze the training dataset for errors or possibly inaccurate values within features, and try to correct these values or exclude the samples that contain them.

  6. We may create new features based on an existing feature or a set of features, such that the new feature follows the correlation, conversion, and completeness goals.

  7. We need to select the right visualization plots and charts depending on the nature of the data and the solution goals.
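Several of these goals (converting, completing, creating, and correlating) can be illustrated in a few lines of pandas. The sketch below is only an example under stated assumptions: the rows are invented, the column names follow the Kaggle Titanic dataset, `FamilySize` is a hypothetical derived feature, and median imputation is just one possible way to fill missing ages.

```python
import pandas as pd

# Made-up rows; column names assumed to match the Kaggle Titanic dataset.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 4.0, None],
    "SibSp": [1, 1, 0, 0, 1, 0],
    "Parch": [0, 0, 0, 0, 2, 0],
})

# Converting: map a categorical feature to numeric values.
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Completing: fill missing ages with the median of the known ages.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Creating: derive a new feature (hypothetical name) from two existing ones.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Correlating: Pearson correlation of each feature with survival.
correlation_with_survival = df.corr()["Survived"].drop("Survived")
```

The order matters here: `df.corr()` only works once every column is numeric, which is why the conversion step comes before the correlation step.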
