vishank94/Movie-Recommendation-System


Table of Contents

  1. Introduction
  2. Approach
  3. Dependencies
  4. Running the Code
  5. Directory Structure

Introduction

This repository contains a solution to the recommendation-system coding challenge.

Approach

  1. This recommendation system uses data from the IMDB/MovieLens dataset.
  2. Concepts used: PageRank, content-based recommendation, and collaborative filtering.
  3. Machine learning terminology you will come across: one-hot encoding, cross-validation, and the R-squared metric.
  4. The notebook compares Linear Regression and Decision Tree models for predicting users' movie ratings.
  5. The ML pipeline: data wrangling -> exploratory data analysis -> feature engineering -> baseline model -> best model.
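
To make the terminology above concrete, here is a minimal pure-Python sketch of one-hot encoding, k-fold cross-validation, and the R-squared metric, applied to a mean-rating baseline. It is not taken from the notebook: the helper names (`one_hot`, `k_fold_indices`, `r_squared`, `mean_baseline_cv`) and the sample ratings are made up for illustration, and the notebook itself presumably relies on library implementations instead.

```python
def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over `categories`."""
    return [1 if c == value else 0 for c in categories]


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k folds over n samples."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def mean_baseline_cv(y, k):
    """Score a 'predict the training mean' baseline on each of k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        y_test = [y[i] for i in test_idx]
        scores.append(r_squared(y_test, [train_mean] * len(y_test)))
    return scores


# Hypothetical sample data, not drawn from the dataset.
sample_ratings = [4, 3, 5, 2, 4, 1, 5, 3]
sample_occupations = ["artist", "doctor", "engineer"]

print(one_hot("doctor", sample_occupations))
print(mean_baseline_cv(sample_ratings, k=2))
```

A mean-only baseline typically scores at or below zero on held-out folds, which is what makes it a useful floor when comparing real models.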

Dependencies

Python libraries: re, ast, time, heapq, decimal, operator, subprocess, numpy, scipy, pandas, seaborn, networkx, rpy2, itertools, matplotlib, datetime, collections, sklearn, surprise

R libraries: doMC, Kmisc, igraph, data.table

Running the Code

  1. The Jupyter notebook recommendationSystem.ipynb (Python kernel) is the master file.
  2. It makes use of:
  • wd_um_graph.txt, generated by weightedDirectedUserGraph.ipynb (R kernel)
  • wu_movie_graph.txt, generated by weightedUndirectedMovieGraph.ipynb (R kernel)
  3. The repository directory structure given below must be maintained for the code to run successfully.
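
As a rough illustration of how a weighted directed graph file could feed PageRank, the sketch below parses an edge list and runs a plain power iteration. The on-disk format of wd_um_graph.txt is not documented here, so the "source target weight" layout, the sample edges, and the function names `parse_edges` and `pagerank` are all assumptions made for this example. The notebook may well use networkx or igraph instead.

```python
def parse_edges(text):
    """Parse lines of 'source target weight' (ASSUMED format) into tuples."""
    edges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        src, dst, weight = parts
        edges.append((src, dst, float(weight)))
    return edges


def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; weights assumed non-negative."""
    nodes = sorted({n for s, d, _ in edges for n in (s, d)})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for s, _, w in edges:
        out_weight[s] += w
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing weight is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for s, d, w in edges:
            if w > 0:
                new_rank[d] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# Hypothetical sample edges, not taken from the real graph files.
sample = """u1 u2 1.0
u2 u3 2.0
u3 u1 1.0
u1 u3 1.0"""

ranks = pagerank(parse_edges(sample))
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

Each node splits its rank among its out-edges in proportion to edge weight, so the scores always sum to 1.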

Directory Structure

The directory structure for my repo is as follows:

├── README.md
├── Data
│   ├── u.data
│   ├── u.genre
│   ├── u.info
│   ├── u.item
│   ├── u.occupation
│   └── u.user
├── Files
│   └── *
└── Scripts
    ├── recommendationSystem.ipynb
    ├── weightedDirectedUserGraph.ipynb
    └── weightedUndirectedMovieGraph.ipynb
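
Since the code depends on this layout, a quick check before launching the notebooks can save a confusing failure later. The following is a minimal sketch, assuming it is run from the repository root. The function name `missing_paths` and the path lists are illustrative and simply mirror the tree above; the contents of Files are not checked because they are not listed.

```python
from pathlib import Path

# Paths copied from the directory tree above.
EXPECTED_FILES = [
    "Data/u.data",
    "Data/u.genre",
    "Data/u.info",
    "Data/u.item",
    "Data/u.occupation",
    "Data/u.user",
    "Scripts/recommendationSystem.ipynb",
    "Scripts/weightedDirectedUserGraph.ipynb",
    "Scripts/weightedUndirectedMovieGraph.ipynb",
]
EXPECTED_DIRS = ["Files"]


def missing_paths(repo_root):
    """Return the expected files and directories absent under repo_root."""
    root = Path(repo_root)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    missing += [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
    return missing


if __name__ == "__main__":
    problems = missing_paths(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Directory structure looks complete.")
```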