
Information Filtering for a Collaborative Learning Environment: A Multiagent Approach

André Luis Silva dos Santos, Sofiane Labidi, Zair Abdelouahab
Federal University of Maranhão (UFMA), Av. dos Portugueses, s/n, Campus do Bacanga, São Luis, MA, Brasil
Federal Center of Technology and Education of Maranhão, Av. Getúlio Vargas, nº 04, Monte Castelo, São Luis, MA, Brasil
andresantos@cefet-ma.br, {sofiane, zair}@dee.ufma.br

Abstract
A multiagent information filtering system is proposed in this paper. The agents improve the teaching-learning process in a computer-based learning environment by means of information filtering. The system is developed using the PASSI agent development methodology. Furthermore, the paper describes the specification and implementation of the agents for NetClass.

1. Introduction
Collaborative learning environments have been addressed by several researchers with the aim of developing interactive learning systems. Usually, these systems are based on Vygotsky's theory [6] to facilitate knowledge acquisition by students. NetClass [1], a system developed at UFMA, is one such collaborative environment. It is a computer-aided environment that uses Internet/Intranet technologies and integrates Intelligent Tutoring Systems and societies of intelligent agents. The Internet, however, stores so much information that students may have difficulty obtaining the data relevant to their learning process, which increases the time required for searching. Information filtering techniques are one way to solve this problem. The term Information Filtering describes several processes for delivering information to the people who need it. The technique may be based on descriptions of the preferences of individuals or groups, which are called profiles. Profiles can be obtained in two ways: explicitly or implicitly. Explicit profiles are provided by users, whereas implicit profiles are captured automatically by means of, for instance, mining [9]. Normally, filtering techniques match documents and profiles by calculating similarities. The main contribution of this paper is the integration of information filtering techniques into a collaborative learning environment, in this case NetClass. Specifically, this work presents a hybrid filtering system that improves the learning process by providing the best indications for study, complementing the teacher's material. Moreover, the hybrid filtering system considers both individual and group profiles. This paper is organized as follows: Section 2 presents the NetClass environment. Section 3 describes the classical information filtering systems. Section 4 shows the architecture and how the application works. Section 5 reports the experiment and its results. Finally, Section 6 presents the conclusions of this paper and future work.

2. NetClass Environment
NetClass is an agent-based cooperative learning environment for distance education. NetClass agents can be either human or artificial [8]. The environment brings students and teachers together, supporting the development of cooperative assignments [1], with assistance from teachers or tutoring systems delivered in a personal and adapted way [2]. In the NetClass environment, students are divided into groups called cooperative areas. Learning occurs through cooperation within a group (inner cooperation) with agents, teachers and other individuals, and through cooperation between groups (outer cooperation) by means of multimedia resources and network technologies. A group, or cooperative area, is formed by three or more students who communicate with each other or with the system. Each stage of learning contains several activities that are distributed among students and groups. The activities are classified into six types in NetClass: group preparation, knowledge presentation, knowledge assimilation, knowledge application, group evaluation and individual evaluation. Each activity has appropriate pedagogical strategies chosen according to the learning model [3]. The NetClass environment is built on a multiagent architecture comprising a tutor agent, domain agent, learning model agent, strategy agent, search agent and filtering agent. The JADE (Java Agent Development) Framework has been used to develop each kind of agent. This work focuses on the filtering agent, presented in Section 4.

3. Filtering Information System

The main purpose of a Filtering Information System (FIS) is to send suggestions of relevant information based on the interests of users. In other words, information in a given domain should be delivered according to a user profile. The techniques used in an FIS depend on the kind of item being searched: papers, movies, music, furniture, electric devices, etc. The most common techniques [10] are: (i) Content-Based Filtering (CBF), which uses profiles and document representations; and (ii) Collaborative Filtering (CF), which is based on the profiles of many users. These techniques are described in the following sections.

3.1. User Profiles

User profiles are meant to return information based on data about users, in an attempt to meet the users' needs. A profile is therefore a knowledge base that contains relevant information about a user in a form that a software application can use. According to Rocha [13], there are two kinds of profiles: content-based and collaborative. With a content-based profile, the search is done by querying a list of terms or keywords. With a collaborative profile, patterns of similar users are evaluated in the queries. Two problems arise when profiles are used. The first occurs when an initial profile must be created for a new user, since that profile contains no information. The second concerns filling in the profile information over time [13].

3.2 Content-Based Filtering

Basically, Content-Based Filtering (CBF) gathers information by analyzing the user's past behavior. In other words, CBF filters information according to the preferences the user has demonstrated for certain items. CBF performs the following steps: define the user profile representation; define the document representation; match profiles against documents; analyze similarity; and filter the information. However, similarity-based CBF has some limitations: users may be interested in items that have no similarity to their profile (overspecialization); the analysis is restricted to textual content; and it is impossible to evaluate the quality of a written text.

3.3 Collaborative Filtering

Collaborative Filtering (CF) calculates the similarity between users instead of between users and documents. This approach is based on exchanging information about previous filtering and common interests. In this way it is possible to increase the accuracy of a search and provide more relevant information to groups of users. A CF technique usually performs the following activities: collect user evaluations; build user models; search for and determine each user's neighbors (users with the same interests); perform a predictive evaluation; and, finally, filter. The user model is an essential part of information filtering, since it separates users into groups according to predefined criteria. In other words, user models (profiles) are used to create user groups (clusters), which are then applied to filter information. Similarity in Collaborative Filtering is calculated through the Pearson coefficient or the cosine measure to estimate the neighborhood of users [10]. The calculation considers the items preferred by the users. Collaborative filtering has some limitations, such as filtering new items and an insufficient number of users. In the first case, a new item can be filtered only after a user has evaluated it. In the second case, the system performs poorly because there are not enough users to form neighborhoods, so the information cannot be filtered.

3.4 Hybrid Filtering

Hybrid Filtering combines resources of both approaches, Collaborative Filtering and Content-Based Filtering. The main objective is to overcome the limitations of each model described above. A hybrid system has been developed in this work. First, we use content-based filtering to perform the initial filtering; then we apply collaborative filtering, building a generic model that involves both approaches. The following sections will show that hybrid filtering obtains the best results by combining the advantages of the CBF and CF techniques.
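The combination of the two techniques can be illustrated with a short sketch. It is written in Python for brevity (the NetClass agents themselves are implemented in Java) and is not the authors' implementation. The function names, the sample grades and the prediction rule (a similarity-weighted average over positively correlated neighbors) are assumptions made for illustration; the source states only that similarity is computed with the Pearson coefficient or the cosine, and that content-based filtering is used while no evaluations exist. The content-based score is assumed to be already scaled to the zero-to-five grade range.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items that both users have graded."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_b = sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)

def collaborative_score(user, item, ratings):
    """Similarity-weighted average of neighbors' grades (assumed rule).

    Returns None when no positively correlated user has graded the item,
    which covers both the new-item and the too-few-users cases.
    """
    own = ratings.get(user, {})
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(own, other_ratings)
        if sim > 0:
            num += sim * other_ratings[item]
            den += sim
    return num / den if den else None

def hybrid_score(user, item, ratings, content_score):
    """Use the collaborative prediction when available, else the content score."""
    cf = collaborative_score(user, item, ratings)
    return cf if cf is not None else content_score

# Hypothetical grades (0 to 5) given by three students to documents d1..d4.
ratings = {
    "ana":   {"d1": 5, "d2": 1, "d3": 4},
    "bruno": {"d1": 4, "d2": 2, "d3": 5, "d4": 4},
    "carla": {"d1": 1, "d2": 5, "d3": 2, "d4": 1},
}

graded_item = hybrid_score("ana", "d4", ratings, content_score=3.2)
new_item = hybrid_score("ana", "d5", ratings, content_score=3.2)
```

In this sketch an item that nobody has graded (`d5`) falls back to its content-based score, while an item graded by a like-minded student (`d4`) takes the collaborative prediction.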

4. Information Filtering Agents

The main purpose of these agents is to provide NetClass students with information according to an area of interest and specific disciplines. The system performs four main activities: retrieving information from a user query [8], profile-based filtering, information discovery, and updating the information base. Filtering can begin at a student's request or be initiated automatically by the system. By using profiles, Filtering Information Systems can meet information needs over a long period of time. Generally, this information is stored in the NetClass database or on the Internet. In the latter case, the main problem is how these data are stored. The first step towards a profile should be taken by the teacher: the teacher specifies the discipline and its main topics, creating the profile explicitly. A profile is represented through vectors of keywords such as Pu = {area, discipline, topics}. A content index is created in NetClass using these vectors. The first filtering pass is content-based, because no evaluations exist yet for the items. A student should grade an item, from zero to five, upon receiving it. This grade can be used both to update the student's profile and for similarity matching in collaborative filtering. During the learning process, the tutor agent can request items on a list of topics from the filtering agent. The filtering agent then sends the results to the tutor agent, which presents Internet links to the student. The learning interface allows students to navigate through the presented links and evaluate each one. This hybrid approach can overcome the limitations of content-based and collaborative filtering presented in Section 3. The next sections describe the modeling process and the implementation of the system.

4.1 Data Base System

All filtering items come from two locations: the NetClass database and the Internet. The first consists of files in different formats, such as doc, pdf, jpg and gif. The second is indexed directly by the search agent. In both cases a vector of keywords and weights is created. Each weight is obtained from the frequency of the keyword in the item (TF, term frequency) or from the inverse document frequency (IDF).

4.2 Modeling and Implementation of System

This system has been modeled using the PASSI (Process for Agent Societies Specification and Implementation) [12] methodology and implemented in Java. The agents were created with the JADE framework. Seven agents compose the system: searching agent, modeling agent, retrieving agent, indexing agent, interface agent, filtering agent and evaluation agent. Each one is described below:

Searching Agent: It is responsible for searching for information (new items) according to a profile or a user query. This agent monitors Web information and the NetClass database, discovering which information can be filtered.
Modeling Agent: It builds the user profile and the document profile from the database and the Internet, and is also responsible for managing them.
Retrieving Agent: It retrieves documents from the database or from the Internet.
Indexing Agent: It is responsible for representing information as a vector of keywords.
Interface Agent: It mediates communication with the rest of the system and returns information to the users. The collaborative profile is built through this agent. This agent was implemented using the Facade pattern, which provides a common high-level interface to a set of other interfaces, making a system or subsystem easier to use.
Filtering Agent: It filters items using content-based or collaborative techniques. This agent performs the similarity analysis between users or between documents and users.
Evaluation Agent: It updates the profiles of users and groups.

These agents were modeled using the PASSI methodology, as already mentioned. According to PASSI, the first step is to describe the domain, as presented in Figure 1, where the requirements are presented as use cases. This figure thus presents a functional view of the system. Each requirement is described by means of scenario construction methods, generating a use case diagram. A sequence diagram is also presented in order to explain the use case diagram. From the use case diagram it is possible to identify all the required agents. Agents are formed by packages or by one or more use cases. External entities that interact with the system are represented by actors. Communication acts take place between agents and actors. Agents can then interact with other agents or actors to achieve their objectives.

Figure 1. Filtering Agent Creation.

The filtering agent (Figure 1) is created by extending the class Agent of the JADE platform.

4.3 The System


The system is modeled using the PASSI (Process for Agent Societies Specification and Implementation) [12] methodology and is implemented in Java. The agents were created with the JADE framework. As Figure 2 shows, documents are represented according to a single vector model. In this model each document is seen as a vector of terms, and each term is associated with a degree of importance (weight) in the document; that is, each document has an array of weights of the form (t1, w1), (t2, w2), ..., (tn, wn), where t is a term and w is its weight in the document. The representation is built as described in the following pseudocode:

1. Pick the collection of documents.
2. Define a universe of terms.
3. For each document of the collection:
   a. Remove punctuation.
   b. Remove stopwords.
   c. Perform stemming.
   d. Estimate the frequency (tf) of each term of the word vector.
   e. Estimate the idf of each term of the word vector.
   f. Estimate the weight of each term.
   g. Build the vector of weights of the document.

Where:
tf - the number of times a word appears in the document;
idf - the relationship between a word and the documents of the collection.
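The indexing steps can be illustrated with a minimal sketch, written in Python for brevity rather than in the Java of the actual system. Several details are assumptions, since the source does not specify them: the stopword list, the regular-expression tokenizer, the omission of stemming, and the formula idf = log(N / df), where N is the number of documents and df is the number of documents containing the term.

```python
import math
import re
from collections import Counter

# Assumed, illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}

def tokenize(text):
    """Lowercase the text, drop punctuation and stopwords (no stemming)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]

def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # df: number of documents in which each term occurs at least once
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

# Hypothetical two-document collection.
docs = [
    "Information filtering and information retrieval",
    "Filtering of text",
]
vectors = tfidf_vectors(docs)
```

With this choice of idf, a term that occurs in every document of the collection (here `filtering`) receives weight zero, so only discriminating terms contribute to the document vector.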

Figure 2. The Hybrid Filtering

The weight is calculated as the product of tf and idf. A document is first represented as a list of terms, for example Doc[0] = [date, defined, filtering, information, type, typical]. After the frequency of the words in each document is calculated against the vector of the universe of keywords, a vector of entries is created, such as Doc[0] = [[filtering, 6.0], [information, 7.0], [retrieval, 2.0], [system, 2.0], [text, 2.0]]. The profile of a user is created, using the vector model, from the content registered by the teacher. When the system or the user requests filtering, it is performed by calculating the cosine similarity between the representation of each document and the user's profile. The pseudocode below describes how the similarity is calculated:

1. Choose the universe of terms, n = x.
2. Determine the inputs:
   a. the weight vectors of the documents;
   b. the weight vector of the profile.
3. Calculate the similarity through the cosine.
4. Create a ranked list of relevant documents.
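The similarity calculation and ranking can be sketched as follows, in Python for brevity. The weights of `doc0` are taken from the example vector in the text; the profile weights, the second document and the function names are hypothetical and serve only to make the sketch runnable.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)

def rank_documents(profile, documents):
    """Return (doc_id, similarity) pairs, most similar document first."""
    scored = [(doc_id, cosine(profile, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical profile built from topics registered by the teacher.
profile = {"filtering": 1.0, "information": 1.0}

documents = {
    # Weights from the Doc[0] example in the text.
    "doc0": {"filtering": 6.0, "information": 7.0, "retrieval": 2.0,
             "system": 2.0, "text": 2.0},
    # Hypothetical second document sharing no term with the profile.
    "doc1": {"system": 3.0, "text": 4.0},
}

ranking = rank_documents(profile, documents)
```

Because the cosine normalizes by vector length, a long document is not favored merely for containing more words; only the direction of its weight vector relative to the profile matters.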

5. Experiments and Results


The experiments were conducted at the Federal Center of Technology and Education of Maranhão (CEFET-MA) in the disciplines of Informatics for Environmental Sanitation, AutoCAD for Roads, and Geographic Information Systems. All of the experiments were carried out over one month in 2007. Groups were created according to the disciplines; students registered themselves in the system, and the teacher registered all the topics in order to create the first profile. Students then used the system to begin the filtering process and make the first evaluations. This gave rise to a second, collaborative profile, which in turn started the hybrid filtering. The documents used in the experiment were in doc format to facilitate their representation and indexing in the NetClass database. From then on, the NetClass search system searches for items, initially seeking the first topic addressed in the discipline and creating a content index.

Figure 3. Domain Diagram Description

Figure 4. Macro Vision of the Prototype

Figure 5. Interface of MAFIS - Items Filtered

This list of ordered items is delivered to the user, who can use the items and/or evaluate them to generate a collaborative profile, since the first profile can only be used for content-based filtering. From this profile, a user model (MU) is created and used to form a model of a group of users (MGU). To perform collaborative filtering, the similarity between the active user x and the MGU clusters is calculated through a clustering technique. The KNN algorithm is used to compare the user with the user-group models and classify the user into a specific group; information is then filtered for the user according to that group. A list of items is delivered to the user, who can use them and/or evaluate them to update the collaborative profile, thereby creating more effective profiles that describe the user's real needs.
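A minimal sketch of this classification step is given below, in Python for brevity. The source says only that KNN is used to place the active user in a group; the choice of cosine similarity over grade vectors, majority voting among the k nearest users, the value k = 3, the group labels and the sample grades are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse {item: grade} vectors."""
    dot = sum(w * v[i] for i, w in u.items() if i in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def knn_group(active, labelled_users, k=3):
    """Assign the active user to the majority group of the k most similar users."""
    neighbours = sorted(
        labelled_users,
        key=lambda entry: cosine(active, entry[0]),
        reverse=True,
    )[:k]
    votes = Counter(group for _, group in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical user models: (grades given to items, group label).
labelled_users = [
    ({"d1": 5, "d2": 4}, "group_a"),
    ({"d1": 4, "d2": 5}, "group_a"),
    ({"d3": 5, "d4": 4}, "group_b"),
    ({"d3": 4, "d4": 5}, "group_b"),
]

active_user = {"d1": 5, "d2": 3}
group = knn_group(active_user, labelled_users, k=3)
```

Once the group is known, the items that its members graded highly can be offered to the active user, which is the filtering step the paragraph describes.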

After the students' first interactions with the system, filtering is performed. At that point none of the users of the system has evaluated these items. The items are compared with the student's profile through CBF techniques, and alerts are then sent to users through a specific interface (Figure 5). These procedures address the limitations of content-based and collaborative filtering with regard to overspecialization, content analysis restricted to textual information and evaluation of text quality for content-based filtering, and an insufficient number of users for collaborative filtering.

6. Conclusions
In this paper a multiagent information filtering system is proposed for NetClass, a computer-based learning environment, to improve the teaching-learning process. Seven filtering agents for NetClass have been specified and used in an experiment to see whether software agents can efficiently filter information from the Web and from NetClass's own database. The study and the implementation of the prototype have shown the viability of using filtering techniques to deliver items. The main contribution of this work is the integration of information filtering techniques, in particular hybrid filtering, into a collaborative learning environment, improving learning by indicating complementary materials that the professor/tutor can suggest. Another contribution is the attempt to overcome the limitations of the individual filtering techniques by using a hybrid technique derived from them. One limitation of this work is that the initial filtering yields only textual items. As future work, we intend to use other algorithms and techniques to verify the efficiency of the information filtering techniques as well as of the multiagent system.

Acknowledgments
Financial support of CEFET-MA is gratefully acknowledged.

7. References
[1] Labidi, S., Souza, C., Nascimento, E. NetClass: Cooperative Learner Modeling in a Web-Based Environment. In Proceedings of the 6th Int. Conf. on Computer Based Learning in Science (CBLIS). Nicosia, Cyprus: University of Cyprus, 2003.
[2] Labidi, S., Costa, N., Ferreira, J. Modeling of an Authoring Tool for an Intelligent Tutoring System. In Proceedings of the 6th Int. Conf. on Computer Based Learning in Science (CBLIS). Nicosia, Cyprus: University of Cyprus, 2003.
[3] Serra Jr., G., Coutinho, L., Labidi, S. Formation of Groups for Cooperative Learning: a Genetic Algorithm Approach. In Proceedings of the Conference on Computers and Advanced Technology in Education (CATE 2001). Banff, Canada, June 27-29, 2001.
[4] Serra Jr., G. Agente de Modelagem do Aprendiz para o Ambiente MATHNET de Ensino Cooperativo Computadorizado. Dissertação (Mestrado em Ciência da Computação), Universidade Federal do Maranhão. São Luís-MA, 2001.
[5] Belkin, Nicholas J., Croft, W. Bruce. Information Filtering and Information Retrieval: Two sides of the same coin. Comm. ACM, vol. 35, no. 12, 1992.
[6] Vygotsky, L. S. A Formação Social da Mente: o Desenvolvimento dos Processos Psicológicos Superiores. Editora Martins Fontes, São Paulo, 1998.
[7] Masiero, Tiago; Cazella, Silvio Cesar; Reategui, Eliseo; Alvares, Luis Otavio Campos. Utilizando Sistemas de Recomendação na Criação de Comunidades Virtuais de Aprendizagem. RENOTE. Revista Novas Tecnologias na Educação, v. 4, n. 2, Dezembro, 2006.
[8] Oliveira, R., Serra Jr., G., Labidi, S., Rabelo, W. Recuperação de Informações Web para um Ambiente de Aprendizagem Computadorizada: uma abordagem multiagente. Dissertação de Mestrado. Coordenação de Pós-Graduação em Engenharia de Eletricidade, Universidade Federal do Maranhão, 2005.
[9] Girardi, Rosario; Marinho, Leandro Balby; Oliveira, Ismenia Ribeiro. A system of agent-based software patterns for user modeling based on usage mining. Interacting with Computers 7. Elsevier, 2005.
[10] Adomavicius, Gediminas; Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, June 2005.
[11] Herlocker, J. Understanding and Improving Automated Collaborative Filtering Systems. Ph.D. Dissertation, University of Minnesota, 2000. Available at http://web.engr.oregonstate.edu/~herlock/papers.html, accessed 1 November 2005.
[12] Cossentino, M., Potts, M. A CASE tool supported methodology for the design of multi-agent systems. In Proc. of the 2002 International Conference on Software Engineering Research and Practice (SERP'02), Las Vegas, USA, June 2002.
[13] Rocha, Catarina Carneiro. RECDOC: Um Sistema de Recomendação para uma biblioteca digital na web. Dissertação de Mestrado. UFRJ, 2003.
