Here is a brief description of the projects I have done.
Term 1 (Machine Learning for Data Scientists) consisted of the following projects:
- Supervised Learning:
  - Find Donors for CharityML with Kaggle: CharityML is a fictitious charity organization that provides financial support. To improve the effectiveness of its donor outreach, I built a model that identifies the most likely donors. I evaluated and optimized several supervised learners (AdaBoost, Decision Tree, and Logistic Regression) to determine which algorithm would provide the highest donation yield. (GitHub Link: https://github.com/savinay/data-science-nanodegree-udacity/tree/master/Supervised%20Learning/project_finding_donors_charity_ml)
- Deep Learning:
  - Create an Image Classifier: I built an image classification application in PyTorch using a convolutional neural network (VGG16) trained on a dataset of images, then used the trained model to classify new images. (GitHub Link: https://github.com/savinay/data-science-nanodegree-udacity/tree/master/Deep%20Learning/image-classification-project/aipnd-project)
- Unsupervised Learning:
  - Creating Customer Segments: I applied unsupervised learning techniques to demographic and spending data for a sample of German households. I preprocessed the data, applied dimensionality reduction, and used clustering algorithms to segment customers, with the goal of optimizing customer outreach for a mail-order company. (GitHub Link: https://github.com/savinay/data-science-nanodegree-udacity/tree/master/Unsupervised%20Learning/Identifying_Customer_Segments)
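The clustering step of the customer segmentation project can be illustrated with a minimal sketch. This is not the project's code: the toy `households` data, the `standardize` and `kmeans` helpers, and the choice of two clusters are all assumptions made for illustration, and the dimensionality reduction step is left out to keep the example short. The project itself may well have relied on library implementations instead.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical toy data: (age, yearly spend) for six households.
households = [[25, 200], [27, 220], [24, 210], [60, 900], [62, 950], [58, 920]]
labels = kmeans(standardize(households), k=2)
print(labels)
```

Standardizing first matters because the spend column would otherwise dominate the distance computation; on this toy data the first three and last three households end up in separate segments.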
Term 2 (Applied Data Science) consisted of the following projects:
- Write a Data Science Blogpost:
  - I used several years of Stack Overflow survey data to identify trends in the technology industry, answering the following questions through data analysis and visualizations: How has the presence of women in technology grown? Which programming language is the most popular? Which country has the most developers? Which development environment is the most popular? The blogpost can be found here: https://medium.com/@savinaynarendra/how-has-the-technology-industry-changed-over-years-c91cbdd24681. GitHub Repository Link: https://github.com/savinay/stackoverflow-blogpost
- Natural Language Processing and Machine Learning Pipelines:
  - Build Pipelines to Classify Messages: I built a data pipeline to prepare message data from major natural disasters around the world, and a machine learning pipeline to categorize emergency text messages by the need communicated by the sender. (GitHub Link: https://github.com/savinay/Disaster-Response-Pipeline)
- Recommendation Engines:
  - I built a recommendation engine based on user behavior and social network data to surface the content most likely to be relevant to a user, using collaborative filtering and matrix factorization techniques.
- Data Capstone Project:
  - Apache Spark for Big Data: I used a large dataset from Sparkify, a Spotify-like music streaming service, to predict customer churn. In this project, I learned about feature engineering and Apache Spark (PySpark). A blogpost about this project can be found here: https://medium.com/@savinaynarendra/sparkify-project-predicting-user-churn-8d9ee4185274. GitHub Repository Link: https://github.com/savinay/sparkify-udacity-dsnd
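The matrix factorization technique mentioned under Recommendation Engines can be sketched roughly as follows. This is an illustrative assumption, not the project's implementation: the `factorize`, `predict`, and `squared_error` helpers, the hyperparameters, and the tiny `observed` interaction list are all hypothetical. The idea is to learn a short vector of latent factors for each user and each item by stochastic gradient descent, so that their dot product approximates the known interactions and can then be used to score unseen user-item pairs.

```python
import random


def factorize(ratings, n_users, n_items, n_factors=2, lr=0.05, epochs=1000, seed=0):
    """Learn user and item factors from (user, item, rating) triples via SGD."""
    rng = random.Random(seed)
    users = [[rng.uniform(0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.uniform(0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(users, items, u, i)
            for f in range(n_factors):
                # Keep the old values so both updates use the same gradient point.
                uf, itf = users[u][f], items[i][f]
                users[u][f] += lr * err * itf
                items[i][f] += lr * err * uf
    return users, items


def predict(users, items, u, i):
    """Predicted interaction: dot product of user and item factors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


def squared_error(users, items, ratings):
    """Total squared error over the observed interactions."""
    return sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)


# Hypothetical interactions: (user index, article index, 1 = read, 0 = not read).
observed = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1), (1, 2, 0), (2, 2, 1), (2, 0, 0)]
users, items = factorize(observed, n_users=3, n_items=3)

# Score a pair that was never observed: user 0 and article 2.
print(predict(users, items, 0, 2))
print(squared_error(users, items, observed))
```

Only observed pairs contribute to the updates, which is what lets this approach cope with a sparse user-item matrix; a production version would normally add regularization and a held-out set for evaluation.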