Using Catboost for Flu Shot Learning: Predict H1N1 and Seasonal Flu Vaccines competition hosted by drivendata.org. #1332

SaiAyachit · 2020-06-12T21:52:53Z

I tried CatBoost in the Flu Shot Learning: Predict H1N1 and Seasonal Flu Vaccines competition hosted by drivendata.org.
The problem is multi-label classification: the goal is to predict the probability that each individual received their H1N1 and seasonal flu vaccines. The dataset includes 35 features and the evaluation metric is roc_auc_score. The training and test data, along with the submission format file, can be found on the competition website: https://www.drivendata.org/competitions/66/flu-shot-learning/page/211/
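
For readers who want to score predictions locally, here is a minimal sketch of ROC AUC averaged over the two labels. It uses only the standard library and the rank-sum formulation of AUC. The label names `h1n1_vaccine` and `seasonal_vaccine` in the usage note, and the choice to average the per-label scores, are my assumptions rather than something stated above; in practice `sklearn.metrics.roc_auc_score` would normally be used instead.

```python
def roc_auc(y_true, y_score):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the end of the group of tied scores starting at i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def mean_auc(targets, predictions):
    """Average the per-label AUC; both arguments map label name -> list."""
    scores = [roc_auc(targets[name], predictions[name]) for name in targets]
    return sum(scores) / len(scores)
```

Usage would look like `mean_auc({"h1n1_vaccine": y1, "seasonal_vaccine": y2}, {"h1n1_vaccine": p1, "seasonal_vaccine": p2})`, where the `y` lists hold 0/1 labels and the `p` lists hold predicted probabilities.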

Since the data had a lot of categorical columns, with up to 22 unique levels each, I wanted to try CatBoost. It worked wonders even with just the basic settings and ranked 29th. Later, with a more advanced bagging approach, it ranked 24th with a submission score of 0.8620. The results were very exciting for me because no other boosting algorithm I tried performed as well as CatBoost. CatBoost not only gave the highest score but also had the shortest training and prediction times of the algorithms I compared.
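
The description above does not say what the bagging approach consisted of, so the following is only a generic sketch of plain bootstrap aggregation, not the method used for the submission. It assumes a hypothetical model interface: `make_model()` returns an object with `fit(X, y)` and `predict(X)`, where `predict` returns one probability per row. With CatBoost, one would wrap `CatBoostClassifier` so that `predict` returns `predict_proba(X)[:, 1]`.

```python
import random


def bagged_predict(make_model, X_train, y_train, X_test, n_bags=5, seed=0):
    """Average predictions from models fit on bootstrap resamples.

    make_model is a hypothetical zero-argument factory; each model it
    returns must expose fit(X, y) and predict(X) -> list of probabilities.
    """
    rng = random.Random(seed)
    n = len(X_train)
    totals = [0.0] * len(X_test)
    for _ in range(n_bags):
        # Sample n row indices with replacement (one bootstrap resample).
        idx = [rng.randrange(n) for _ in range(n)]
        model = make_model()
        model.fit([X_train[i] for i in idx], [y_train[i] for i in idx])
        for k, p in enumerate(model.predict(X_test)):
            totals[k] += p
    # The bagged prediction is the mean over all resampled models.
    return [t / n_bags for t in totals]
```

Averaging probabilities rather than hard labels keeps the output usable for a ranking metric such as ROC AUC.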

No other algorithm I tried worked as well with categorical data, and it was also very easy to implement: there was no need to encode the categorical values or do much data preprocessing. Eliminating NaN values was enough.

I am very happy to use this library and thought I would share my experience. I hope it turns out to be useful for others. This is the first time I'm contributing to an open source project, so please bear with my mistakes and let me know of any changes needed. I'll be available at: [email protected]

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Before submitting a pull request, please do the following steps:

Read instructions for contributors.
Run ya make in catboost folder to make sure the code builds.
Add tests that test your change.
Run tests using ya make -t -A command.
If you haven't already, complete the CLA.


…examples using catboost for multi-label classification; Flu Shot Learning: Predict H1N1 and Seasonal Flu Vaccines competition hosted by drivendata.org. The data had a lot of categorical values.

review-notebook-app · 2020-06-12T21:52:59Z

Check out this pull request on ReviewNB

Review Jupyter notebook visual diffs & provide feedback on notebooks.

Powered by ReviewNB

SaiAyachit added 2 commits June 13, 2020 03:09

Merge pull request #1 from SaiAyachit/SaiAyachit-patch-1-competition-…

876d831

…examples using catboost for multi-label classification; Flu Shot Learning: Predict H1N1 and Seasonal Flu Vaccines competition hosted by drivendata.org. The data had a lot of categorical values.

georgthegreat force-pushed the master branch from 8a97ab9 to 743cf89 Compare February 2, 2023 17:25
