Titanic: Visualization & Prediction

Predict survival on the Titanic and get familiar with ML basics. ( ⭐️ Star us on GitHub — it helps! )

Getting started with competitive data science can be quite intimidating, so I built this notebook as a quick overview of the Titanic: Machine Learning from Disaster competition. For your convenience, please view it on Kaggle.

I encourage you to fork this kernel/GitHub repo, play with the code and enter the competition. Good luck!

Requirements

This project requires Python 2.7 and the following Python libraries installed:

You will also need software installed to run and execute an IPython Notebook.

Code

An IPython notebook is used for data preprocessing, feature transformation and outlier detection. All core scripts are in the "file .ipynb" folder. All input data are in the input folder, and a detailed description of the data can be found on Kaggle.

Key features of the model training process

K Fold Cross Validation: Using 5-fold cross-validation.

First-Level Learning Models: On each run of cross-validation, the following models were fitted:

Random Forest classifier
Extra Trees classifier
AdaBoost classifier
Gradient Boosting classifier
Support Vector Machine

Second-Level Learning Model: An XGBClassifier trained with xgboost on the first-level outputs.
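The two-level setup above relies on out-of-fold predictions: each training row is predicted by a first-level model that was fitted on the other folds, and those predictions become the input features of the second-level model. The following is a minimal sketch of that idea in plain Python, not the notebook's actual code. The names `kfold_indices`, `out_of_fold_predictions` and `MajorityClassifier` are hypothetical, the toy data is made up, and the folds are contiguous and unshuffled for simplicity.

```python
from collections import Counter


def kfold_indices(n_samples, n_splits=5):
    """Split range(n_samples) into n_splits contiguous (train, test) index lists."""
    # The first (n_samples % n_splits) folds receive one extra sample.
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        stop = start + base + (1 if i < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((train, test))
        start = stop
    return folds


class MajorityClassifier:
    """Hypothetical stand-in for a first-level model with fit/predict."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def out_of_fold_predictions(make_model, X, y, n_splits=5):
    """Predict every training row with a model that never saw that row."""
    oof = [None] * len(X)
    for train_idx, test_idx in kfold_indices(len(X), n_splits):
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        for i, p in zip(test_idx, preds):
            oof[i] = p
    return oof


# Toy data: 10 rows with one feature each (values are illustrative only).
X = [[v] for v in range(10)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# One column of second-level features per first-level model.
first_level = [MajorityClassifier, MajorityClassifier]
columns = [out_of_fold_predictions(m, X, y) for m in first_level]
meta_features = [list(row) for row in zip(*columns)]
```

Any estimator exposing `fit` and `predict`, such as the scikit-learn classifiers listed above, could take the place of `MajorityClassifier`. The resulting `meta_features` rows are what a second-level model such as an XGBClassifier would then be trained on.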

Content in Notebook

Data Preprocessing
Exploratory Visualization
Feature Engineering
1. Value Mapping
2. Simplification
3. Feature Selection
4. Handling Categorical Data
Modeling & Evaluation
1. Trying Different Models without Validation
2. Cross-validation method
3. Model scoring function
4. Setting Up Models
Train & Fit Model
Our Base First-Level Models
Second-Level Predictions From The First-Level Output
Output as Prediction File (.csv)
Acknowledgments

FlowChart

Prediction & Submission

The model comparison with cross-validation for the first output layer:

The final survival prediction for each passenger is stored in the output folder as a .csv file. The final model used for scoring is a hyperparameter-tuned XGBoost classifier with cross-validation.
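For reference, the Kaggle Titanic competition expects a submission file with a `PassengerId` column and a `Survived` column containing 0 or 1. The sketch below shows one way such a file could be written with the standard library. The function name `write_submission` and the sample values are assumptions for illustration, not taken from the notebook.

```python
import csv
import io


def write_submission(passenger_ids, predictions, out_file):
    """Write one (PassengerId, Survived) row per test passenger."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for pid, pred in zip(passenger_ids, predictions):
        writer.writerow([pid, int(pred)])  # Survived is written as 0 or 1


# Illustrative values only, not real model output.
buffer = io.StringIO()
write_submission([892, 893, 894], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, pass an open file object, for example one opened with `open(path, "w", newline="")`, where `path` is a hypothetical location inside the output folder.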

The final XGBoost classifier can be viewed as:

Contributors

Rohit Kumar Singh (IIT Bombay)

Feedback

Feel free to send us feedback or file an issue. Feature requests are always welcome. If you wish to contribute, please take a quick look at the notebook on Kaggle.

Acknowledgments

Inspiration is drawn from various Kaggle notebooks, but the main motivation comes from the following:

Credit for image to https://miro.medium.com/

Written with StackEdit.

