Using Data Manipulation, Cleaning and Visualization Techniques on Movie Ranking Dataset from GroupLens Research
This script applies various data analysis techniques to movie rating data. The dataset was gathered from GroupLens Research and can be accessed from here.
This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996 and September 24, 2018. This dataset was generated on September 26, 2018.
Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.
The data are contained in the files links.csv, movies.csv, ratings.csv and tags.csv. More details about the contents and use of all these files follows.
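As a minimal sketch of how these four files could be read, the snippet below uses only the Python standard library. The helper name `load_movielens` and the folder argument are assumptions made for illustration; the script itself may load the data differently (for example with a dataframe library).

```python
import csv
from pathlib import Path

# The four file names come from the dataset description above.
FILES = ("links", "movies", "ratings", "tags")


def load_movielens(data_dir):
    """Read each CSV file into a list of dicts keyed by its header row.

    `data_dir` is a hypothetical path to the unpacked dataset folder.
    All values are returned as strings; convert them as needed.
    """
    tables = {}
    for name in FILES:
        path = Path(data_dir) / f"{name}.csv"
        # newline="" lets the csv module handle line endings itself.
        with open(path, newline="", encoding="utf-8") as handle:
            tables[name] = list(csv.DictReader(handle))
    return tables
```

Assuming the archive was unpacked into a local folder named `ml-latest-small`, calling `load_movielens("ml-latest-small")` would return a dictionary with one list of rows per file.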
This is a development dataset. As such, it may change over time and is not an appropriate dataset for shared research results. See available benchmark datasets if that is your intent.
This and other GroupLens data sets are publicly available for download at http://grouplens.org/datasets/. The dataset was gathered on 8 September 2021.
- Data Cleaning and Manipulation
- Total Count of Each Rating Value
- Which Genres Get Better Ratings
- Rating Count by Year
- Which Tags Are Used Most
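
Two of the steps listed above, counting each rating value and counting ratings per year, can be sketched with the standard library alone. This is an illustrative outline, not the script's actual implementation: the function names are hypothetical, the sample rows are made up, and the code assumes each rating row carries `rating` and `timestamp` fields, with the timestamp in seconds since the Unix epoch (UTC).

```python
from collections import Counter
from datetime import datetime, timezone


def rating_value_counts(ratings):
    """Count how many times each rating value (e.g. 0.5 to 5.0) occurs."""
    return Counter(float(row["rating"]) for row in ratings)


def rating_counts_by_year(ratings):
    """Count ratings per calendar year, reading timestamps as UTC seconds."""
    return Counter(
        datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc).year
        for row in ratings
    )


# Made-up rows for illustration only, shaped like lines of ratings.csv.
sample_ratings = [
    {"userId": "1", "movieId": "1", "rating": "4.0", "timestamp": "964982703"},
    {"userId": "1", "movieId": "3", "rating": "4.0", "timestamp": "964981247"},
    {"userId": "2", "movieId": "1", "rating": "3.5", "timestamp": "1537799250"},
]

value_counts = rating_value_counts(sample_ratings)
year_counts = rating_counts_by_year(sample_ratings)
```

Using `Counter` keeps both steps to a single pass over the rows; sorting `value_counts.items()` or `year_counts.items()` gives an ordered series ready for plotting.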