Skip to content

All the Data Science projects I've done, during the course, or after.

Notifications You must be signed in to change notification settings

abinash-behera-016/Data-Science-Projects

Repository files navigation

Data Science Portfolio

Welcome to my Data Science Portfolio on GitHub! This repository contains a variety of projects where I apply machine learning algorithms to solve real-world problems. Each project folder includes relevant datasets in CSV format, R or Python scripts detailing the analysis, and a README explaining the project's context and findings.

Technologies Used

  • R, Python
  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Random Forest
  • XGBoost
  • Neural Networks
  • Packages for modelling: tree, rpart, randomForest, xgboost, nnet, neuralnet, tseries, forecast, scikit-learn, keras, tensoflow
  • Packages for visualization: ggplot2, matplotlib, seaborn

Feel free to explore the projects and reach out if you have questions or would like to collaborate!

Contact

  • Click here to visit my LinkedIn
  • Click here to mail me

Enjoy exploring my projects, and I appreciate any feedback or contributions to improve my work!

Note : I have often used a function createDummies, given by our instructor during the course, to create dummies out of character columns. This function can take a threshold frequncy value below which an observation won't get it's own dummy. I usually keep the threshold as 2% of total number of rows. You can find it's R script above. Another package fastDummies available on CRAN works great too but does't take threshold frequency as an input, as a result you might end up with loads of new dummies corresponding to observations that rarely occur in a column.