Richieone13/portfolio

Richie Wong's Data Science Portfolio

This repository contains a portfolio of data science projects completed for academic, self-learning, and professional purposes, presented as Jupyter Notebooks.

If you like what you see, or want to chat about the portfolio, work opportunities, or collaboration, feel free to contact me on LinkedIn.

Personal Website: http://richiewong.co.uk/

Python Libraries: pandas, NumPy, Matplotlib, Seaborn, scikit-learn, SciPy, Stats, GeoPandas, Bokeh, Folium


Machine Learning Classification Problems

Exploratory Data Analysis Customers Profile

Predicting Customer Churn based on Profile

Predicting customer churn for a telecommunications technology company by applying various ML models. I focus on understanding the demographics, profiles, and services that can influence customer churn, and on applying that understanding to build an effective and practical ML model.


Predicting if a Loan Application is Successful

Classification problem to predict whether an applicant will be eligible for a loan. Includes Exploratory Data Analysis and reaches 80% accuracy in predicting eligibility using Logistic Regression with cross-validation. Dataset size: 614 entries with a set of 13 features.
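A minimal sketch of how a cross-validated logistic regression could be scored, not the notebook's actual code. It assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in for the loan dataset; the 12 feature columns, the scaling step, and the 5 folds are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 614 rows to mirror the dataset size,
# 12 numeric feature columns plus a binary target.
X, y = make_classification(n_samples=614, n_features=12, random_state=0)

# Scale the features, then fit a logistic regression classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation, reporting accuracy on each held-out fold.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()

print(f"Fold accuracies: {scores}")
print(f"Mean accuracy: {mean_accuracy:.3f}")
```

Averaging accuracy over held-out folds gives a steadier estimate than a single train/test split, which matters on a dataset of only a few hundred rows.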



Predicting if the User Likes These Songs

Classification problem to predict whether the user likes or dislikes a song. Includes Exploratory Data Analysis and reaches 72% accuracy in predicting whether the person likes or dislikes a song using a Decision Tree model. Dataset size: 2,017 entries with a set of 17 features.


Machine Learning Regression Problems

House Price Prediction

Exploratory Data Analysis of the California housing market. Comparison of performance across different ML models, including Linear Regression, SVR, Decision Tree, and Random Forest. Covers data cleaning, one-hot encoding, cross-validation, grid search, and randomized search.


Statistical Hypothesis Testing

A/B Testing Webpage Design for Conversions

A/B testing on two different designs: the control and experimental landing pages. Note: this is a simple experiment that does not factor in the funnel and journey to conversion, or track metrics such as bounce rate.
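As an illustration of the kind of test involved, here is a hedged sketch of a two-sided, two-proportion z-test for conversion rates, using only the standard library. The function name `two_proportion_z_test` and the visitor and conversion counts are hypothetical; the notebook may use a different test or library.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a, conv_b: number of conversions in control (A) and experiment (B).
    n_a, n_b: number of visitors in each group.
    Returns the z statistic and the two-sided p-value.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only.
z, p = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be due to chance alone, under the usual assumptions of independent visitors and reasonably large samples.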


Data Discovery


Airbnb New York EDA and Predictive Modelling

An opportunity to work on a complete, ready-to-use dataset (~49,000 listings) and find interesting trends, including the market rent for apartments and shared rooms across the different boroughs of New York. Explores different libraries: GeoPandas, Bokeh, Folium.



Exploratory Data Analysis 🏀 College Basketball

Exploratory Data Analysis of a USA College Basketball (2015-2019) dataset. Since a young age, my passion has been playing and watching basketball. My motivation is to learn the history of college basketball and what the formula for success within a team is.



EDA + Feature Engineering - Golden Globe Awards

Being a fan of movies myself, I wanted to learn more about the Golden Globe Awards and their trends. Here I focus on feature engineering to gather more meaningful insight into the different types of awards and the successful movies, directors, and actors/actresses.



Exploring the TEDTalk Dataset

Exploratory Data Analysis on the TED Talks dataset from 2017, using the pandas library to answer questions about a real-world dataset and to practise "best practices" for using pandas. Answering interesting questions like: What were the "best" events in TED history to attend? Which TED talks provoke the most online discussion? Which occupations deliver the funniest TED talks on average?


Power BI Reports


London Airbnb Power BI Report:

  • Reporting on current listings available in London as of August 2020.
  • Use of visualisation software Power BI to aggregate and report on data