GitHub - flaviabpscht/analise_projeto_3_parte2: Create models to classify customers as either ‘churn’ or ‘non-churn,’ contributing valuable insights

Project 3 - Classification was divided into two parts. Part 2 covers feature engineering and the selection and application of the best machine learning model.

Motivation

Many real-world problems a data scientist faces are modeled as classification. When the dependent variable is discrete, we have a classification problem. There may be only two classes (a binary dependent variable) or more than two (a multiclass problem).

Creating, training, validating, and testing the model are important steps, but they are not the whole job. Choosing the best way to handle missing data, performing good feature engineering, and selecting the best metric for each project are essential parts that require a certain "creativity" on our part.
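To illustrate why metric selection matters, the sketch below computes accuracy, precision, recall, and F1 by hand for a binary problem. The labels are made up for illustration and the function names are hypothetical; none of this comes from the project notebook. It shows that a model that always predicts the majority class can score high accuracy while finding none of the positive cases.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1, returning 0.0 when undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up, imbalanced labels: only 2 of 10 examples are positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A "model" that always predicts the majority (negative) class.
always_negative = [0] * len(y_true)

metrics = classification_metrics(y_true, always_negative)
print(metrics)
```

Here accuracy looks acceptable even though recall is zero, which is why a metric such as recall or F1 may be a better fit when the positive class is rare.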

Object of Study

A very common problem for many companies is finding the best way to retain their customers. Knowing in advance whether a customer is likely to cancel services is a significant competitive advantage for any company. Marketing, CRM, and sales teams can benefit greatly from knowing which customers are most likely to stop using the company's services.

This type of problem is called churn prediction (from churn rate, the percentage of customers who leave the company within a given period). To solve it, we need a historical dataset of customers who left and customers who stayed, along with their characteristics.
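As a minimal sketch of the churn rate definition, the function below computes the percentage of customers flagged as having left in a small historical table. The records and the field names `customer_id`, `tenure_months`, and `churned` are hypothetical and are not taken from the project's dataset.

```python
def churn_rate(customers):
    """Return the percentage of customers whose 'churned' flag is True."""
    if not customers:
        return 0.0  # no customers observed, so no churn to report
    left = sum(1 for customer in customers if customer["churned"])
    return 100.0 * left / len(customers)


# Made-up historical records for one observation period.
history = [
    {"customer_id": "A1", "tenure_months": 2, "churned": True},
    {"customer_id": "B2", "tenure_months": 40, "churned": False},
    {"customer_id": "C3", "tenure_months": 12, "churned": False},
    {"customer_id": "D4", "tenure_months": 5, "churned": True},
]

rate = churn_rate(history)
print(f"Churn rate: {rate:.1f}%")
```

The `churned` flag plays the role of the dependent variable, and the remaining fields are the customer characteristics a model would learn from.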

Banks, telecommunications companies, retailers: any business that provides services and holds data about its customers can benefit from predictive models similar to the ones we will build.

In this project, we will create models to classify customers as "churn" or "non-churn", giving Let's Talk (the telecommunications company of the Let's Data holding) important information for retaining its customers.

The dataset can be obtained from Kaggle. It describes a fictional telecommunications company and includes demographic data, the services each customer contracted, and whether the customer left the company.
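Demographic and service fields of this kind are often categorical, and most models expect numeric input. The sketch below shows one possible feature engineering step: one-hot encoding a categorical field while treating missing values as their own category. The column name `contract` and the sample rows are hypothetical, and this is not necessarily the treatment used in the notebook.

```python
def one_hot_encode(rows, column, missing_label="missing"):
    """Replace a categorical column with 0/1 indicator columns.

    Rows where the column is absent or None are assigned to missing_label.
    """
    values = [
        row[column] if row.get(column) is not None else missing_label
        for row in rows
    ]
    categories = sorted(set(values))  # stable, predictable column order
    encoded = []
    for row, value in zip(rows, values):
        # Copy every field except the one being encoded.
        new_row = {key: val for key, val in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if value == category else 0
        encoded.append(new_row)
    return encoded


# Made-up rows; the third customer has no contract type recorded.
rows = [
    {"customer_id": "A1", "contract": "monthly"},
    {"customer_id": "B2", "contract": "yearly"},
    {"customer_id": "C3", "contract": None},
]

encoded = one_hot_encode(rows, "contract")
for row in encoded:
    print(row)
```

Keeping missing values as a separate category is only one option; dropping the rows or imputing the most frequent value are alternatives, and the right choice depends on the data.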

Repository contents (5 commits):

.ipynb_checkpoints
data
Projeto_3_parte_2.ipynb
README.md
