
CHAPTER I Background

1.1 Introduction Keyword extraction is the process of selecting a few salient words from a text and using those words to summarize its content. The task has long been studied in the natural language processing community because it is important for many text applications, such as document retrieval. Domain-specific keyword extraction drew attention when researchers found that fully exploiting domain-specific information can greatly improve performance on this task. For an efficient keyword extraction process, the weight of a keyword is an important matter: a highly weighted keyword represents the relevance of the document, which is especially beneficial in a domain repository. However, it has been pointed out that previous research has not considered the structural information of term frequency and inverse document frequency, and as a result keyword extraction has not been fully optimal. Term frequency estimates a term's contribution within a local document, while inverse document frequency estimates the term's global contribution within the corpus and so reduces the localized effect on the term's score. In information science, an ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. The solution proposed in this paper uses the structural information of term frequency and inverse document frequency of the weighted keywords and formulates semantic relationships among those keywords by building an ontology based on WordNet.
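To make the roles of the two quantities concrete, the following is a minimal sketch of the conventional TF-IDF weighting, not the structural variant proposed in this work. The specific formulas (term frequency as count divided by document length, inverse document frequency as log(N / df) without smoothing) and the function names `tf_idf` and `top_keywords` are assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per tokenized document.

    Assumed formulas (not taken from the source text):
      tf  = occurrences of the term / number of tokens in the document
      idf = log(number of documents / documents containing the term)
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))

    weights = []
    for doc in corpus:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(doc_weights, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    corpus = [
        "keyword extraction selects salient words from a text".split(),
        "an ontology represents concepts within a domain".split(),
        "domain keyword weights combine term frequency".split(),
    ]
    for doc_weights in tf_idf(corpus):
        print(top_keywords(doc_weights))
```

Under these assumed formulas, a term that occurs in every document receives weight zero, which is the sense in which inverse document frequency removes the localized effect from a term's score.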

1.2 Motivation Keyword extraction has become difficult as the amount of information grows very large; it needs efficient algorithms that retrieve keywords without delay while keeping precision and recall at a satisfactory level. The prospects of this research field are therefore bright, and it calls for more efficient ideas for developing solutions that meet human information needs. This motivated us to study this field. Several extraction algorithms have been developed so far to address different levels of problems in information retrieval, which are becoming more complex day by day. However, we have observed that there have been few approaches to keyword extraction in domain repositories. As Web content and web pages increase day by day, domain-specific keyword extraction has become increasingly important for effective and efficient information retrieval. Domain-specific keyword extraction gained attention when researchers discovered that fully exploiting domain-specific information can greatly improve the performance of this task. This motivated us to pursue research in this area.

1.3 Objectives We have planned and proposed solutions to fulfill the following objectives:

To use the semantic similarity among the weighted keywords in domain-specific extraction for the weighted sum method.

To develop a method that exploits the structural information for computing the term frequency and inverse document frequency of the keyword.
