
A Content-Based Textbook Recommender System Using Neural Network

Partha Sarathi Chakraborty
Department of Information Technology, University Institute of Technology, University of Burdwan, Burdwan, West Bengal

Abstract
Recommender systems have proved very useful for coping with information overload on the Internet. Many web sites attempt to help users by incorporating a recommender system that provides them with a list of items and/or web pages that are likely to interest them. This paper describes a content-based textbook recommender system that employs a neural network as the machine learning algorithm to learn the personal preferences of users and provide tailored suggestions.

Key Words- recommender system, text categorization, book recommendation

1. Introduction
In many markets, consumers are faced with a wealth of products and information from which to choose. To alleviate this problem, many web sites attempt to help users by incorporating a recommender system [1] that provides users with a list of items and/or web pages that are likely to interest them. Once the user makes her choice, a new list of recommended items is presented. In an online bookstore, too, the user has to choose one book or a few books from a huge database using some search method. However, conventional search methods are convenient only when the user knows exact information such as the title or the ISBN of the book. Here an online book recommender system plays an important role in helping the user choose books of interest.

There are two main approaches to designing recommender systems: collaborative filtering and content-based. The majority of existing systems are based on collaborative filtering. Collaborative filtering [2], [3] works by building a database of items with users' opinions on them. A specific user is then matched against this database to find her neighbors, that is, the users with whom she shares similar tastes. Collaborative filtering has been used successfully by e-commerce sites and in the area of information filtering [4], [5].

Content-based recommendation systems recommend items based on the content of the items and the target user's ratings. Two different content-based approaches have been proposed: feature-based and text categorization-based. Feature-based recommendation systems [6], [7] extract important features from the item descriptions and learn a user's profile (classifier) from a set of feature vectors preclassified according to the user's ratings. Text categorization [13] systems learn from thousands of features (words or phrases). Several systems using text categorization (TC) have been developed and applied to recommend web pages [8] and books [9].

In this paper, we apply learning for text categorization to the domain of book recommendation. The machine learning algorithm used is a three-layer fully connected feed-forward neural network.

The rest of the paper is organized as follows. Related work is discussed in Section 2. The proposed system is described in detail in Section 3. Experimental results are presented in Section 4. Future work and references are given in Sections 5 and 6, respectively.

2. Related work
Several systems have been developed in the domain of book recommendation. Online bookstores such as Amazon and Barnes & Noble have popular recommendation services. A content-based approach was employed in one of the first book recommending systems [10, 11]. One important work in this domain is the LIBRA system of Mooney and Roy [9]. LIBRA uses a content-based approach for recommending books by applying automated text-categorization methods to semi-structured text extracted from the web. After the user rates some books, the system learns a profile of the user using a Bayesian learning algorithm and produces a ranked list of the most recommended additional titles from the system's catalog.

In our approach, we use a soft computing tool, a neural network, to recommend books online.

The reduced feature set is formed by taking only the b terms with the highest TF x IDF weight. The values of the parameters a and b are determined experimentally.

3. Our Approach
3.1 Overview
In our recommender system, a user first chooses the subject for which he or she is seeking a book. The user may also narrow the search by specifying the name of an author or of a publishing house. A list of books is then presented with a brief description of each one. The user evaluates some of these books by rating them on a scale of 1 to 5. Based on these ratings, the system recommends a small number of books.

3.4 Neural recommender


A three-layer fully connected feed-forward neural network is used in our approach to recommend books, with the sigmoid function as the activation function. The number of input nodes equals the dimensionality of the reduced term set. The number of output nodes is 5, because users rate books on a scale of 1 to 5. The number of hidden nodes is determined experimentally, choosing the value that gives the best success rate. In our study, the training set consists of the weighted feature vectors of the tables of contents of all the books evaluated by the user, each paired with the grade (on a scale of 1 to 5) that the user gave to that book. The neural network is trained using the backpropagation learning algorithm.
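The network described above can be illustrated with a minimal Python sketch. It is not the implementation used in our system: the one-output-node-per-rating target encoding, the squared-error loss, the learning rate, the weight initialization range, and the names `ThreeLayerNet`, `one_hot` and `total_error` are all assumptions made for illustration, and the three toy feature vectors stand in for real TF x IDF vectors.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def one_hot(rating, n_out=5):
    # Assumed encoding: output node k stands for rating k + 1.
    return [1.0 if k == rating - 1 else 0.0 for k in range(n_out)]

class ThreeLayerNet:
    """Input, hidden and output layers, fully connected, sigmoid activations."""

    def __init__(self, n_in, n_hidden, n_out=5, seed=0):
        rng = random.Random(seed)
        # One weight row per node; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        x_b = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x_b)))
                  for row in self.w_hidden]
        h_b = hidden + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, h_b)))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, rating, lr=0.5):
        """One backpropagation update on squared error for a single rated book."""
        target = one_hot(rating, len(self.w_out))
        hidden, out = self.forward(x)
        d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        # Hidden deltas use the output weights as they were before this update.
        d_hidden = [h * (1.0 - h) * sum(d * row[j] for d, row in zip(d_out, self.w_out))
                    for j, h in enumerate(hidden)]
        for d, row in zip(d_out, self.w_out):
            for j, v in enumerate(hidden + [1.0]):
                row[j] -= lr * d * v
        for d, row in zip(d_hidden, self.w_hidden):
            for i, v in enumerate(list(x) + [1.0]):
                row[i] -= lr * d * v

    def predict(self, x):
        """Return the rating (1 to 5) whose output node is most active."""
        _, out = self.forward(x)
        return out.index(max(out)) + 1

def total_error(net, samples):
    err = 0.0
    for x, rating in samples:
        _, out = net.forward(x)
        err += sum((o - t) ** 2 for o, t in zip(out, one_hot(rating, len(out))))
    return err

# Toy training set: (feature vector, user rating) pairs, purely hypothetical.
samples = [([1.0, 0.0, 0.0], 5), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 3)]
net = ThreeLayerNet(n_in=3, n_hidden=4)
error_before = total_error(net, samples)
for _ in range(2000):
    for x, rating in samples:
        net.train_step(x, rating)
error_after = total_error(net, samples)
predictions = [net.predict(x) for x, _ in samples]
```

In this sketch the number of hidden nodes (`n_hidden`) is a free parameter, which mirrors the experimental selection described above: one would train several networks with different values and keep the one with the best success rate on held-out ratings.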

3.2 Data Sets


Book information was collected from the website of Pearson Education (www.pearsonhighered.com). The title, author, brief description and table of contents of 500 books on different subjects were collected from this site and used as our dataset.

4. Experimental Results
Several metrics are commonly used to evaluate recommender systems. We use precision and recall to evaluate our system. Precision is the percentage of recommended items that are correctly recommended. Recall is the percentage of relevant items that are recommended. Accuracy is the number of correctly classified items divided by the total number of classified items. For our current system, precision was 56% and recall was 57.1%. The overall percentage of successful recommendations was 71%.
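As an illustration of the two metrics, the following Python sketch computes precision and recall from a set of recommended books and a set of relevant books. The function name `precision_recall` and the book identifiers are hypothetical, and treating "relevant" as the set of books the user would rate highly is an assumption; the figures it produces are unrelated to the results reported above.

```python
def precision_recall(recommended, relevant):
    """Precision: hits / recommended items. Recall: hits / relevant items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # correctly recommended items
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: four books recommended, three books actually relevant.
recommended = ["book_a", "book_b", "book_c", "book_d"]
relevant = ["book_a", "book_c", "book_e"]
precision, recall = precision_recall(recommended, relevant)
```

Here two of the four recommended books are relevant, so precision is 2/4, and two of the three relevant books were recommended, so recall is 2/3.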

3.3 Text representation


For each book, the title, author, price, name of the publishing house, a brief description and the table of contents (TOC) are stored in the database. The TOC of each book is represented as a bag of words (BOW). The collection of the TOCs of all books is our document set. We use the TF x IDF term weighting scheme to automatically assign weights to the words of the TOCs over the collection. The product of the term occurrence frequency (TF) and the inverse document frequency (IDF) is a good measure of the importance of a term in a document set. The inverse document frequency of the i-th term is commonly defined as [12]:

$$\mathrm{IDF}_i = \log \frac{N}{n}$$

5. Future Works
The proposed system has been tested only with a small dataset. Rigorous evaluation of the system on a large dataset is required. We are currently trying to apply a genetic algorithm to select the neural network weights and observing the resulting change in performance. For comparative analysis, we are also interested in using a self-organizing map in place of the feed-forward network in our recommender system.

Here N is the number of documents in the document set, and n is the number of documents in which the i-th term appears. By this definition, a term that appears in fewer documents has a higher IDF. The assumption behind this definition is that terms concentrated in a few documents are more helpful in distinguishing between documents with different topics. We also use TF x IDF to reduce the dimensionality of the feature space with the help of two parameters, a and b. First, only those terms whose TF x IDF weight is greater than the threshold value a are considered. The reduced feature set is then formed from the b highest-weighted terms among these.
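The TF x IDF weighting and the two-parameter reduction described in this section can be illustrated with a short Python sketch. Several details are assumptions, since the text does not fix them: raw counts are used for TF, the natural logarithm of N/n is used for IDF, and a term's weight over the whole collection is taken to be its maximum TF x IDF weight in any single document. The function names `tfidf_weights` and `reduce_terms`, the toy tables of contents, and the values of a and b are hypothetical.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: TF x IDF} dict per document (a document is a list of words)."""
    n_docs = len(docs)  # N: number of documents in the document set
    # n: number of documents in which each term appears
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term occurrence frequency (assumed TF variant)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights

def reduce_terms(weights, a, b):
    """Keep terms whose weight exceeds a, then only the b highest-weighted of them."""
    best = {}
    for doc_weights in weights:
        for term, w in doc_weights.items():
            # Assumption: a term is scored by its maximum weight in any document.
            best[term] = max(best.get(term, 0.0), w)
    kept = [t for t, w in best.items() if w > a]
    kept.sort(key=lambda t: (-best[t], t))  # ties broken alphabetically
    return kept[:b]

# Toy tables of contents, each already split into a bag of words.
tocs = [
    "introduction to neural networks and learning".split(),
    "introduction to database systems and queries".split(),
    "neural networks for text learning".split(),
]
weights = tfidf_weights(tocs)
vocab = reduce_terms(weights, a=0.5, b=3)
# Each book becomes a weighted feature vector over the reduced term set.
vectors = [[w.get(t, 0.0) for t in vocab] for w in weights]
```

In this toy collection, words that occur in two of the three documents receive a weight of log(3/2), which falls below the threshold a = 0.5, while words that occur in only one document receive log(3) and survive the first step; the second step then keeps b = 3 of them. The resulting vectors have the form that would be fed to the input nodes of the network.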

6. References
[1] P. Resnick and H. R. Varian. Recommender systems. Special issue of Communications of the ACM, pages 56-58, March 1997.
[2] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J. 1994. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of the Computer Supported Collaborative Work Conference. Chapel Hill, NC.
[3] Shardanand, U. and Maes, P. 1995. Social Information Filtering: Algorithms for Automating Word of Mouth. In Proceedings of CHI, pp. 210-217. Denver, CO.
[4] Schafer, B., Konstan, J., Riedl, J. 1999. Recommender Systems in E-Commerce. In Proceedings of ACM Conference on Electronic Commerce.
[5] Sarwar, B.M., Karypis, G., Konstan, J.A. and Riedl, J. 2000a. Application of Dimensionality Reduction in Recommender System - A Case Study. In ACM WebKDD 2000 Web Mining for E-commerce Workshop.
[6] D. Billsus and M. Pazzani, "A Personal News Agent that Talks, Learns and Explains", Third Intern. Conf. on Autonomous Agents (Agents '99), Seattle, Washington, 1999.
[7] M. Pazzani, J. Muramatsu, D. Billsus, Syskill & Webert: Identifying Interesting Web Sites, AAAI-96, pp. 54-61, 1996.
[8] M. Pazzani, J. Muramatsu, D. Billsus, Syskill & Webert: Identifying Interesting Web Sites, AAAI-96, pp. 54-61, 1996.
[9] R. J. Mooney, L. Roy, Content-Based Book Recommending Using Learning for Text Categorization, Fifth ACM Conf. on Digital Libraries, 2000.
[10] E. Rich. User modeling via stereotypes. Cognitive Science, 3:329-354, 1979.
[11] E. Rich. Users are individuals: Individualizing user models. International Journal of Man-Machine Studies, 18:199-214, 1983.
[12] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513-523, 1988.
[13] Fabrizio Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys, 34(1):1-47, 2002.
