Trained a model that estimates if a lead is likely to be converted based on lead behavior in historical customer data using ML.

abhijitpai000/predictive_lead_scoring

Predictive Lead Scoring using ML

Overview:

Predictive lead scoring analyzes lead behavior in historical customer data to find patterns that result in a positive business outcome, such as a closed deal with a client. In this study, I developed a lead scoring model using the Bank Marketing dataset, which records whether each client subscribed to a term deposit following a direct marketing campaign run by a Portuguese bank.

Model Design:

Because the right way to evaluate a classification model depends on which errors matter most, I assumed as the business objective that the cost of losing a potential customer exceeds the cost of sales resources. Statistically, this translates to a model with few false negatives and many true positives, while keeping false positives in check.
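One way to make this objective concrete is to choose the decision threshold that minimizes a weighted error cost. The sketch below is illustrative only: the function names, the 5:1 cost ratio, and the candidate thresholds are assumptions, not values taken from this study.

```python
from typing import List, Optional, Sequence


def total_cost(y_true: Sequence[int], scores: Sequence[float],
               threshold: float, fn_cost: float, fp_cost: float) -> float:
    """Sum the cost of missed conversions (FN) and wasted sales effort (FP)."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 0:
            cost += fn_cost  # a lead that would have converted was ignored
        elif label == 0 and predicted == 1:
            cost += fp_cost  # sales time spent on a lead that did not convert
    return cost


def pick_threshold(y_true: Sequence[int], scores: Sequence[float],
                   fn_cost: float = 5.0, fp_cost: float = 1.0,
                   candidates: Optional[List[float]] = None) -> float:
    """Return the candidate threshold with the lowest total cost.

    The default costs are hypothetical; they only encode fn_cost > fp_cost.
    """
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    return min(candidates,
               key=lambda t: total_cost(y_true, scores, t, fn_cost, fp_cost))
```

Because a false negative is priced higher than a false positive, the selected threshold tends to sit lower than the default 0.5, which trades some extra sales effort for fewer missed conversions.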

Model Outcome:

To mimic a real-time model evaluation, I held out ~10,000 observations from the dataset and trained on ~30,000 observations. The following is the result of my trained LightGBM model on the hold-out dataset.

75.04% of the leads predicted by the model resulted in conversion, with a 29.56% false positive rate and a 24.95% false negative rate.
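For reference, rates like these can be derived from hold-out labels and predictions as sketched below. The function names are illustrative and not part of this repo's 'src' modules. Note that the true positive rate and the false negative rate always sum to 1. The reported 75.04% and 24.95% sum to roughly 100%, so the 75.04% figure may be the true positive rate (recall) rather than precision.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn) for binary labels, where 1 means converted."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely; an undefined rate is reported as NaN."""
    return numerator / denominator if denominator else float("nan")


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the rates discussed above from hold-out predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "true_positive_rate": _ratio(tp, tp + fn),   # share of converters caught
        "false_negative_rate": _ratio(fn, tp + fn),  # share of converters missed
        "false_positive_rate": _ratio(fp, fp + tn),  # non-converters flagged
        "precision": _ratio(tp, tp + fp),            # flagged leads that converted
    }
```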

Segmenting Leads based on Model Predictions:
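A minimal sketch of how predicted conversion probabilities could be bucketed into segments follows. The segment names ("hot", "warm", "cold") and the cutoffs 0.7 and 0.4 are hypothetical placeholders, not the segments used in this study.

```python
from typing import Dict, List


def segment_lead(probability: float,
                 hot_cutoff: float = 0.7,
                 warm_cutoff: float = 0.4) -> str:
    """Map a predicted conversion probability to a segment label.

    The cutoffs are illustrative assumptions and should be tuned per campaign.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= hot_cutoff:
        return "hot"
    if probability >= warm_cutoff:
        return "warm"
    return "cold"


def segment_leads(scores: Dict[str, float]) -> Dict[str, List[str]]:
    """Group lead IDs by segment, given a mapping of lead ID to probability."""
    segments: Dict[str, List[str]] = {"hot": [], "warm": [], "cold": []}
    for lead_id, probability in scores.items():
        segments[segment_lead(probability)].append(lead_id)
    return segments
```

Sales teams could then prioritize the highest segment first and route the lowest segment to cheaper channels.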

Data Source

UCI Machine Learning Repository - Bank Marketing dataset.

Final Report & Package Walk-Through

To reproduce this study, use the modules in the 'src' directory of this repo (setup instructions below). A walk-through of the package is presented in the final report.

Setup instructions

Creating Python environment

This repository has been tested on Python 3.7.6.

  1. Clone the repository:

git clone https://github.com/abhijitpai000/predictive_lead_scoring.git

  2. Navigate to the cloned repository:

cd predictive_lead_scoring

  3. Download the raw data from the data source link and place it in the "datasets" directory.

  4. Install virtualenv and create an environment:

pip install virtualenv

virtualenv lead_scoring

  5. Activate it by running (Windows path shown; on macOS/Linux, use source lead_scoring/bin/activate):

lead_scoring/Scripts/activate

  6. Install the project requirements:

pip install -r requirements.txt

Note

  • For make_dataset(), please place the raw data (bank-additional -> bank-additional-full.csv from data source) in the 'datasets' directory.
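As a rough illustration of what reading this file involves, the sketch below loads it with the standard library. It is not the repo's make_dataset() implementation: the function name load_raw_leads is hypothetical, and it assumes the UCI file layout (semicolon-delimited, with the target in a column named "y" holding "yes" or "no").

```python
import csv
from pathlib import Path
from typing import Dict, List, Tuple


def load_raw_leads(csv_path: str) -> Tuple[List[Dict[str, str]], List[int]]:
    """Read the raw CSV and split it into feature rows and binary labels.

    Assumes a semicolon delimiter and a target column "y" with values
    "yes"/"no", as in the UCI bank-additional-full.csv file.
    """
    path = Path(csv_path)
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; place the raw data in the 'datasets' directory"
        )
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter=";"))
    # Separate the target from the features; 1 means the client subscribed.
    features = [{k: v for k, v in row.items() if k != "y"} for row in rows]
    labels = [1 if row["y"] == "yes" else 0 for row in rows]
    return features, labels
```

A call such as load_raw_leads("datasets/bank-additional-full.csv") would then return the feature rows and labels, provided the file has been placed as described in the note above.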
