🎥 Big data project for college (HHS) period 7

timvisee/hhs-p7-movie-recommendation-engine

Movie Recommendation Engine

Our attempt at a movie recommendation engine, built with Apache Spark in a Jupyter Notebook.

Our implementation uses a machine learning model that is trained on a given data set. Once trained, the system can predict the ratings a user would give to movies they have not watched yet.

The script consists of two parts. The first part parses the data set and trains the machine learning model.
The second part loops through all users and suggests the top 10 movies for each of them, based on the predicted ratings for movies that user has not watched yet.
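The second part can be illustrated with a minimal sketch in plain Python. This is not the notebook's Spark code: the function names `top_n_unseen` and `recommend_for_all`, the dictionary layout, and the toy ratings are all assumptions made for illustration. It only shows the idea of ranking predicted ratings per user while skipping movies that user has already rated.

```python
import heapq


def top_n_unseen(predicted, seen, n=10):
    """Return up to n (movie_id, predicted_rating) pairs, best first,
    skipping movies the user has already rated."""
    candidates = (
        (movie_id, rating)
        for movie_id, rating in predicted.items()
        if movie_id not in seen
    )
    return heapq.nlargest(n, candidates, key=lambda pair: pair[1])


def recommend_for_all(predictions_by_user, seen_by_user, n=10):
    """Loop through all users and collect their top-n unseen movies."""
    return {
        user_id: top_n_unseen(predicted, seen_by_user.get(user_id, set()), n)
        for user_id, predicted in predictions_by_user.items()
    }


# Hypothetical toy data: user id -> {movie id: predicted rating}.
predictions_by_user = {
    1: {10: 4.5, 20: 3.0, 30: 4.9},
    2: {10: 2.0, 20: 4.1},
}
# Hypothetical toy data: user id -> set of movie ids already rated.
seen_by_user = {1: {30}, 2: set()}

recommendations = recommend_for_all(predictions_by_user, seen_by_user, n=2)
print(recommendations)
```

In this toy run, movie 30 is left out for user 1 because that user has already rated it, even though it has the highest predicted rating.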

The engine is available in the MovieRecommendationEngine.ipynb file. Make sure to put the required data files in the same directory as the notebook file.

Requirements

This project requires the following:

  • Spark with Hadoop, to run the notebook containing the algorithm's code.
  • Data files:
    • movies_full.csv
    • movies_small.csv
    • ratings_full.csv
    • ratings_small.csv

The data files can be fetched from MovieLens.
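As a rough illustration of the parsing step, the sketch below reads ratings with Python's standard `csv` module. It assumes the ratings files keep the usual MovieLens header (`userId,movieId,rating,timestamp`); that layout, the helper name `parse_ratings`, and the two sample rows are assumptions, and the notebook itself does this work in Spark.

```python
import csv
import io


def parse_ratings(lines):
    """Parse MovieLens-style rating rows into (user_id, movie_id, rating) tuples.

    `lines` is any iterable of text lines, such as an open file.
    The timestamp column is ignored.
    """
    reader = csv.DictReader(lines)
    return [
        (int(row["userId"]), int(row["movieId"]), float(row["rating"]))
        for row in reader
    ]


# Hypothetical sample rows standing in for the contents of a ratings file.
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,31,2.5,1260759144\n"
    "1,1029,3.0,1260759179\n"
)

ratings = parse_ratings(sample)
print(ratings)
```

To use it on a real file, open the file with `open("ratings_small.csv", newline="")` and pass the file object to `parse_ratings`.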
