Project 1: Movie-Recommendation-System, Project 2: Fake News Detection System

Sitaras/Data-Mining

Data-Mining

Project 1 - Movie Recommendations

The dataset of this project contains Netflix movies.

Part 1

In the first part of the project, we explore the dataset and produce statistics about its content. Some of the statistics are:

  • Number of movies/series.
  • Country with the most content.
  • Year with the most content.
  • The popularity of each genre for every country.
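
The statistics listed above can be sketched with simple counting. This is a minimal illustration, not the notebook's code: the records and the field names (`type`, `country`, `release_year`, `genres`) are hypothetical stand-ins for the dataset's real columns.

```python
from collections import Counter

# Hypothetical records; field names are assumptions, not the dataset's confirmed schema.
titles = [
    {"type": "Movie", "country": "United States", "release_year": 2019, "genres": ["Drama"]},
    {"type": "TV Show", "country": "India", "release_year": 2019, "genres": ["Comedy", "Drama"]},
    {"type": "Movie", "country": "United States", "release_year": 2018, "genres": ["Comedy"]},
    {"type": "Movie", "country": "India", "release_year": 2019, "genres": ["Drama"]},
    {"type": "Movie", "country": "United States", "release_year": 2020, "genres": ["Drama"]},
]

# Number of movies and series.
type_counts = Counter(t["type"] for t in titles)

# Country and year with the most content.
top_country, top_country_count = Counter(t["country"] for t in titles).most_common(1)[0]
top_year, top_year_count = Counter(t["release_year"] for t in titles).most_common(1)[0]

# Popularity of each genre for every country: one Counter of genres per country.
genre_by_country = {}
for t in titles:
    genre_by_country.setdefault(t["country"], Counter()).update(t["genres"])

print(type_counts)
print(top_country, top_year)
print(genre_by_country)
```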

Part 2

We implement a recommendation system that recommends movies similar to a given movie. To represent each movie, we tried the following two representations:

To compute the similarity between the representations, we used:
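
The representations and similarity measures used in the notebook are not listed in this README, so the following is only an illustrative sketch of the general idea. It assumes a bag-of-words representation of a hypothetical description text per movie and cosine similarity as the measure; the titles, descriptions, and the `recommend` helper are made up for the example.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as word counts (an assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, descriptions, top_n=2):
    """Return the top_n other titles ranked by similarity to the given title."""
    query = bag_of_words(descriptions[title])
    scores = [
        (other, cosine_similarity(query, bag_of_words(text)))
        for other, text in descriptions.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical data for illustration only.
descriptions = {
    "Movie A": "a detective hunts a killer in the city",
    "Movie B": "a detective chases a killer across the city",
    "Movie C": "two friends open a bakery in paris",
}

recommendations = recommend("Movie A", descriptions, top_n=1)
print(recommendations)
```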

PS: If the notebook cannot be opened on GitHub, you can view it via the Jupyter nbviewer:

Project 2 - Fake/True News Classification

Given a dataset of news articles, we train a model that classifies each article as fake or true. We try different ways to represent the text of each article, such as:

  • Bag Of Words
  • TF-IDF
  • Word2Vec
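
To make the first two representations concrete, here is a minimal pure-Python sketch on made-up headlines. It uses the textbook weighting tf × log(N / df); libraries often apply smoothed variants, so the exact numbers will differ from what the notebook produces. Word2Vec is left out because it needs a trained embedding model.

```python
import math
from collections import Counter

# Made-up documents for illustration only.
docs = [
    "breaking news shocking secret",
    "government announces new budget",
    "shocking secret cure revealed",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})


def bow_vector(tokens):
    """Bag of Words: one raw count per vocabulary word."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


# Document frequency: in how many documents each word appears.
df = Counter(w for doc in tokenized for w in set(doc))
n_docs = len(tokenized)


def tfidf_vector(tokens):
    """TF-IDF: term frequency scaled down for words common across documents."""
    counts = Counter(tokens)
    return [(counts[w] / len(tokens)) * math.log(n_docs / df[w]) for w in vocab]


bow = [bow_vector(t) for t in tokenized]
tfidf = [tfidf_vector(t) for t in tokenized]

print(vocab)
print(bow[0])
print([round(x, 3) for x in tfidf[0]])
```

In the first headline, "breaking" appears in only one document and so gets a higher TF-IDF weight than "shocking", which appears in two.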

We also train several models in order to compare their performance:

  • Logistic Regression
  • Naive Bayes
  • Support Vector Machines (SVM)
  • Random Forest

Finally, we compare the performance of every representation/model combination.
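
One way to run such a comparison is a grid over representations and models. The sketch below assumes scikit-learn is installed and uses a tiny made-up dataset, so the scores it prints mean nothing. It is not the notebook's code. Word2Vec is omitted here because it needs a separate embedding library, and its dense vectors, which can contain negative values, do not fit Multinomial Naive Bayes.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up articles for illustration only (1 = fake, 0 = true).
texts = [
    "shocking secret cure doctors hate",
    "aliens secretly control the government",
    "miracle pill melts fat overnight",
    "celebrity shocking secret revealed",
    "parliament approves the annual budget",
    "central bank keeps interest rates unchanged",
    "city council opens new public library",
    "government publishes annual employment report",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Factories, so every combination gets a fresh, unfitted object.
representations = {
    "Bag of Words": CountVectorizer,
    "TF-IDF": TfidfVectorizer,
}
models = {
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Naive Bayes": MultinomialNB,
    "SVM": LinearSVC,
    "Random Forest": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {}
for rep_name, make_rep in representations.items():
    for model_name, make_model in models.items():
        # The pipeline refits the vectorizer on each training fold,
        # so no vocabulary leaks from the held-out fold.
        pipeline = make_pipeline(make_rep(), make_model())
        scores = cross_val_score(pipeline, texts, labels, cv=2)
        results[(rep_name, model_name)] = scores.mean()

# Print combinations from highest to lowest mean accuracy.
for (rep_name, model_name), accuracy in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{rep_name:12s} + {model_name:20s} mean accuracy = {accuracy:.2f}")
```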