
Introduction

This project intends to use Association Rule Mining, within a hybrid recommendation system,
to recommend movies similar to the one the user selects.

Need for the Project:


In the past several years, public adoption of over-the-top (OTT) media services such as
Netflix, Amazon Prime and Hotstar has surged. Customers have an ever-increasing appetite for
media, and there is an abundance of content creators and of platforms on which their content
can be distributed. One of the main problems associated with these providers, however, is
their recommendation system. Many customers feel that the recommendations, or the cataloguing
of the content, are not aligned with their interests. As a result, they find it inconvenient
to locate suitable content after watching something they liked.

Project Aim:
This project aims to serve as a foundation for a robust recommendation system that gives the
user a seamless viewing experience.

Purpose of Report:
The purpose of this report is to describe the methods used to create this recommendation
system: the kind of database used, the pre-processing and data cleaning required to use the
data effectively, and the way recommendations are produced from the movie selected by the
user.

Limitations of Report:
Model accuracy is not covered in this report because movie recommendations are a matter of
taste, and it is difficult to quantify the accuracy of the model without conducting surveys
or collecting user feedback.

Project Objective
To serve as a base for developing a robust recommendation system for movies using Association
Rule Mining.
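
To make the idea of Association Rule Mining concrete, the sketch below computes support, confidence and lift for rules of the form "genre A -> genre B". It is only an illustration under stated assumptions: the report does not specify which mining algorithm or thresholds were used, so the brute-force pair enumeration, the function names (support, pair_rules), the threshold values and the toy transactions are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy transactions: each set holds the genres attached to one movie.
transactions = [
    {"Action", "Adventure", "Sci-Fi"},
    {"Action", "Adventure"},
    {"Comedy", "Romance"},
    {"Action", "Sci-Fi"},
    {"Comedy", "Drama", "Romance"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for single-item rules that pass both (assumed) thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to yield a useful rule
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                # lift > 1 means rhs appears with lhs more often than by chance
                lift = confidence / support({rhs}, transactions)
                rules.append((lhs, rhs, pair_support, confidence, lift))
    return rules


rules = pair_rules(transactions)
for lhs, rhs, sup, conf, lift in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

A rule such as "Comedy -> Romance" with high confidence and lift above 1 would suggest recommending Romance-tagged movies to a user who selected a Comedy. Enumerating every pair is only practical for a small label vocabulary; a full implementation would more likely rely on an algorithm such as Apriori or FP-Growth.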

Methodology
Data Source:
GroupLens Research has collected and made available rating data sets from the MovieLens
web site (http://movielens.org). The data sets were collected over various periods of time,
depending on the size of the set.
The data set used for this project is MovieLens Latest Datasets – Full
(https://grouplens.org/datasets/movielens/latest/).
Key Features:

• Contains 27,753,444 ratings and 1,108,997 tag applications across 58,098 movies
• Created by 283,228 users between January 09, 1995 and September 26, 2018
• Users were selected at random for inclusion
• All selected users had rated at least 1 movie
• No demographic information is included
• Each user is represented by an id, and no other information is provided
The rationale for using this dataset is that it is a dynamic dataset that is updated over
time, so that recent movie ratings recorded on the website are captured in it. This is useful
for recommendation systems because OTT catalogues are constantly updated with new movies, and
any dataset used for recommendation should reflect the latest relevant additions.
Nature of Data:
The analysis primarily uses the genres or tags attached to each movie. These are category
labels with no inherent order, so the data is nominal in nature.
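
Because nominal labels have no order, they are usually turned into sets or 0/1 indicator columns before rule mining. The sketch below shows one way this could be done, assuming the movies.csv layout distributed with MovieLens (columns movieId, title, genres, with genres separated by "|" and "(no genres listed)" used as a placeholder). The sample rows and the function names load_genre_sets and one_hot are hypothetical, and the report does not state that this exact encoding was used.

```python
import csv
import io

# Hypothetical sample rows in the assumed movies.csv layout; the titles and
# genre lists are illustrative and not taken from the dataset.
SAMPLE_CSV = """movieId,title,genres
1,Example Movie A (1995),Adventure|Animation|Children
2,Example Movie B (1995),Comedy|Romance
3,Example Movie C (1996),(no genres listed)
"""

NO_GENRES = "(no genres listed)"


def load_genre_sets(csv_file):
    """Map each movieId to the set of genre labels on that row."""
    genre_sets = {}
    for row in csv.DictReader(csv_file):
        labels = row["genres"].split("|")
        # Treat the placeholder for missing genres as an empty set.
        genre_sets[int(row["movieId"])] = {g for g in labels if g != NO_GENRES}
    return genre_sets


def one_hot(genre_sets):
    """Encode each movie as a 0/1 vector over the sorted genre vocabulary."""
    vocabulary = sorted(set().union(*genre_sets.values()))
    rows = {
        movie_id: [int(genre in genres) for genre in vocabulary]
        for movie_id, genres in genre_sets.items()
    }
    return vocabulary, rows


genre_sets = load_genre_sets(io.StringIO(SAMPLE_CSV))
vocabulary, encoded = one_hot(genre_sets)
print(vocabulary)
print(encoded)
```

To run this against the real file, the io.StringIO object could be replaced by open("movies.csv", newline="", encoding="utf-8"). The resulting genre sets can serve directly as transactions for rule mining.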
