igorvgp/README.md

Welcome to my Data Science Portfolio

On this page, I show how I apply Data Science knowledge and tools to solve business challenges.

Ígor Pereira

Data Scientist/Analyst

I am a Mechanical Engineering student, and my first experience with data came from working on the Technology team of a Public Health organization. There, I carried out Data Analysis projects that involved tracking and monitoring Dengue cases in the city, managing exam orders, and following up on Covid-19 cases. This experience helped me understand the key data analysis tools and how to use them to generate insights that encourage data-driven decision making.

I have also worked in the Logistics department of an automotive company, where I produced customer demand, inventory, and purchase order reports through data analysis. To this end, I developed efficient data extraction and transformation processes that made the relevant logistics key performance indicators (KPIs) easily accessible through interactive dashboards. I also led several data flow automation projects, which helped optimize work performance and streamline the team's access to important information.

Currently, I am a member of a Data Science community where I run personal projects to gain experience in solving business problems with data analysis concepts and tools. In this community, I share knowledge with other members by taking part in study groups, tutoring sessions, discussions, and competitions.

Skills:

  1. Programming Languages:

    • Python for Data Analysis
      • Linear Algebra and Data Manipulation: Pandas, NumPy
      • Data Visualization: Streamlit, Matplotlib, Seaborn, Plotly
      • Machine Learning: Scikit-Learn, SciPy (Classification, Regression and Clustering)
      • Web Scraping: Beautiful Soup, Selenium
      • API building: Flask
    • SQL
  2. Data Visualization Tools:

    • Power BI
    • Tableau
    • Google Looker Studio
  3. Machine Learning Deployment:

    • AWS (S3, EC2, RDS)
    • Heroku
    • Render
  4. Other tools:

    • Microsoft Excel

Links:

  • LinkedIn Badge
  • Gmail Badge

Data Science Projects:

An insurance company that provides Health Insurance to its customers needs to predict whether last year's policyholders will also be interested in the Car Insurance it offers. The company surveyed these customers, asking whether they were interested in car insurance. With the survey results and the customers' characteristics, the company can use Machine Learning techniques to maximize profit from this product.

In this context, I developed a Learning to Rank Machine Learning model that is able to rank customers by their propensity to buy auto insurance, so salespeople can target customers who are most likely to buy it.

Compared to a random selection of customers to be contacted, the machine learning model developed proved to be about 3 times more efficient, generating an extra gain of 35 million dollars.
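As a rough illustration of what "about 3 times more efficient than random selection" means, the sketch below computes lift at k in plain Python. The function names, scores, and labels are hypothetical and invented for this example; they are not taken from the project.

```python
def precision_at_k(scores, labels, k):
    """Fraction of interested customers among the k highest-scored ones."""
    # Sort customers by predicted propensity, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def lift_at_k(scores, labels, k):
    """Precision in the top k divided by the overall interest rate.

    A random selection has an expected lift of 1.0.
    """
    base_rate = sum(labels) / len(labels)
    return precision_at_k(scores, labels, k) / base_rate


# Hypothetical propensity scores and survey answers (1 = interested).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

lift = lift_at_k(scores, labels, k=4)
print(f"Lift at 4: {lift:.1f}x over random selection")
```

In this toy data, calling the top 4 ranked customers reaches all 3 interested customers, while 4 random calls would reach 1.5 on average.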

An E-commerce company wants to implement a loyalty program for its most valuable customer group, called "Insiders", so the marketing team can provide benefits to this group and encourage other customers to join it. To find out who those customers are, the company provided a dataset containing all the transactions that occurred between Nov-2016 and Dec-2017.

For this problem, an unsupervised Machine Learning model was developed to cluster customers based on their similarities. This made it possible to identify a subset of high-value customers who make up just 15.7% of the total customer base but contribute 51.7% of the total revenue.
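Once each customer has a cluster label, figures like the customer share and revenue share of a group can be computed with a simple aggregation. The following is a minimal sketch in plain Python; the function name, labels, and revenue values are hypothetical and do not come from the project's dataset.

```python
from collections import defaultdict


def cluster_shares(cluster_labels, customer_revenue):
    """Return {cluster: (share of customers, share of revenue)}."""
    counts = defaultdict(int)
    revenue = defaultdict(float)
    for label, value in zip(cluster_labels, customer_revenue):
        counts[label] += 1
        revenue[label] += value

    total_customers = len(cluster_labels)
    total_revenue = sum(customer_revenue)
    return {
        label: (counts[label] / total_customers, revenue[label] / total_revenue)
        for label in counts
    }


# Hypothetical cluster assignment and revenue per customer.
cluster_labels = ["insiders", "regular", "regular", "regular", "regular"]
customer_revenue = [500.0, 100.0, 150.0, 150.0, 100.0]

shares = cluster_shares(cluster_labels, customer_revenue)
for label, (customer_share, revenue_share) in shares.items():
    print(f"{label}: {customer_share:.1%} of customers, {revenue_share:.1%} of revenue")
```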

The European pharmacy chain Rossmann plans to allocate a portion of their budget towards renovating their stores. In order to calculate the amount that will be dedicated to this purpose, the company's CFO requested that the data team develop a sales revenue forecasting solution for the next 6 weeks.

A Machine Learning regression model was developed and achieved a MAPE (mean absolute percentage error) of 14%. The model predicts $283.7M in sales for the next 6 weeks.
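For reference, MAPE averages the absolute error of each prediction relative to the actual value. The sketch below is a minimal plain-Python version with made-up sales figures; skipping records whose actual value is zero is an assumption of this example, not a documented choice of the project.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.14 means 14%)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Assumption: records with zero actual sales are skipped,
    # since the percentage error is undefined for them.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no records with non-zero actual values")
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Hypothetical actual and predicted sales, invented for illustration.
actual_sales = [100.0, 200.0, 400.0]
predicted_sales = [110.0, 180.0, 400.0]

score = mape(actual_sales, predicted_sales)
print(f"MAPE: {score:.1%}")
```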

Pinned

  1. Data-Analysis-Enem-2017 Public

    Insights project on Jupyter Notebook using Python and Data visualization with Microsoft Power BI

    Jupyter Notebook

  2. data_analysis_house_rocket Public

    House Rocket's strategic area wants to find the best opportunities for buying and selling real estate to maximize the company's profits.

    HTML

  3. previsao_alugueis Public

    Development of an end-to-end solution for a model that predicts property rental prices in São Paulo

    Jupyter Notebook

  4. sistema_recomendacao_livros Public

    Development of a book recommendation system using the K-nearest neighbors Machine Learning algorithm

    Jupyter Notebook