israel-santanna/semantic-clustering

Semantic Space Clustering Recommender System

A Recommender System based on Semantic Space Clustering.

How to use

The MovieLens 1M dataset stored in the data directory contains only the movie names, so to gather the plots, genres, and reviews you must first extract them from IMDb by running:

python datahandler/imdb_extractor.py

Once the information has been fetched, you can run the Recommender System:

python3 main.py
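
The two steps above must run in order, because main.py needs the data that the extractor fetches. A minimal sketch of a driver that enforces this ordering follows. It is not part of the repository: the names `pipeline_commands` and `run_pipeline` are hypothetical, and it assumes both scripts are launched from the repository root with the same Python interpreter.

```python
import subprocess
import sys

# Script paths as given in this README, relative to the repository root.
EXTRACTOR = "datahandler/imdb_extractor.py"
MAIN = "main.py"


def pipeline_commands(python=sys.executable):
    """Return the two commands in the order the README prescribes."""
    return [[python, EXTRACTOR], [python, MAIN]]


def run_pipeline(python=sys.executable):
    """Run the IMDb extraction, then the recommender (hypothetical helper)."""
    for cmd in pipeline_commands(python):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so main.py is never started if the extraction step fails.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would then be equivalent to typing the two commands by hand.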

Dependencies

  • Gensim: Library that implements the Paragraph Vector algorithm.
  • IMDbPY: Used for searching IMDb.
  • ImdbPie: Used for retrieving reviews from IMDb.
  • NumPy: Numerical computing.
  • Scikit-Learn: Used to normalize the DensityPeakCluster decision graph, so that the density and distance thresholds can be chosen automatically.
  • Matplotlib: Used only if you want to plot the clusters for debugging.
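
The Scikit-Learn entry above mentions normalizing the density-peak decision graph in order to pick thresholds. This README does not say how the project does it, so the following is only an illustrative sketch under stated assumptions: min-max scaling of both axes (the same scaling that scikit-learn's `MinMaxScaler` performs, written here in plain Python to stay self-contained), a simple cutoff-count density, and fixed cutoffs of 0.5 on the scaled values. All function and parameter names are hypothetical.

```python
import math


def decision_graph(points, dc):
    """Return (rho, delta) per point: local density and separation distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Simplification: every point with no denser neighbour (ties included)
        # gets its largest distance to any other point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pick_centers(points, dc, rho_min=0.5, delta_min=0.5):
    """Pick indices whose scaled density and distance both pass the cutoffs.

    The 0.5 defaults are assumptions for illustration, not project values.
    """
    rho, delta = decision_graph(points, dc)
    rho_n, delta_n = min_max(rho), min_max(delta)
    return [i for i in range(len(points))
            if rho_n[i] >= rho_min and delta_n[i] >= delta_min]
```

Because both axes are scaled to the same range, a single pair of cutoffs can be reused across datasets whose raw densities and distances differ widely, which is one plausible reason to normalize the decision graph before thresholding.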
