1 Introduction to Information Retrieval Systems

1.1 Definition of Information Retrieval System
1.2 Objectives of Information Retrieval Systems
1.3 Functional Overview
1.4 Relationship to Database Management Systems
1.5 Digital Libraries and Data Warehouses
1.6 Summary

This chapter defines an Information Storage and Retrieval System (called
an Information Retrieval System for brevity) and differentiates between
information retrieval and database management systems. Tied closely to the
definition of an Information Retrieval System are the system objectives.
Satisfying those objectives determines which areas receive the most attention
in development. For example, academia pursues all aspects of information
systems, investigating new theories, algorithms and heuristics to advance the
knowledge base. Academia does not worry about response time, the resources
required to implement a system that supports thousands of users, or the operations
and maintenance costs associated with system delivery. On the other hand, commercial
institutions are not always concerned with the optimum theoretical approach, but
with the approach that minimizes development costs and increases the salability of
their product. This text considers both viewpoints and technology states:
throughout, information retrieval is viewed from both the theoretical and the
practical perspective.
The functional view of an Information Retrieval System is introduced to
put into perspective the technical areas discussed in later chapters. As detailed
algorithms and architectures are discussed, they are viewed as subfunctions within
a total system. They are also correlated to the major objective of an Information
Retrieval System which is minimization of human resources required in the
finding of needed information to accomplish a task. As with any discipline,
standard measures are identified to compare the value of different algorithms. In
information systems, precision and recall are the key metrics used in evaluations.
Early introduction of these concepts in this chapter will help the reader in
understanding the utility of the detailed algorithms and theory introduced
throughout this text.
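
As a minimal sketch of the two metrics named above, assuming the standard
set-based definitions (precision is the fraction of retrieved items that are
relevant; recall is the fraction of relevant items that were retrieved). The
function name and the item identifiers are hypothetical and chosen only for
illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: item ids the system returned for the query
    relevant:  item ids judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually retrieved
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 items returned, 3 of them relevant, 6 relevant items exist.
p, r = precision_recall(
    ["d1", "d2", "d3", "d7"],
    ["d1", "d2", "d3", "d4", "d5", "d6"],
)
print(p, r)  # 0.75 0.5
```

In this example the system returns mostly relevant items (high precision) but
misses half of what the user needed (lower recall), which is the kind of
trade-off these measures are meant to expose.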
There is a potential for confusion in the understanding of the differences
between Database Management Systems (DBMS) and Information Retrieval
Systems. It is easy to confuse the software that optimizes functional support of
each type of system with actual information or structured data that is being stored
and manipulated. The importance of the differences lies in the inability of a
database management system to provide the functions needed to process
“information.” The opposite, an information system containing structured data,
also suffers major functional deficiencies. These differences are discussed in detail
in Section 1.4.

1.1 Definition of Information Retrieval System

An Information Retrieval System is a system that is capable of storage,
retrieval, and maintenance of information. Information in this context can be
composed of text (including numeric and date data), images, audio, video and
other multi-media objects. Although the objects in an Information Retrieval
System take diverse forms, text has been the only data type that lent
itself to full functional processing. The other data types have been treated as
highly informative sources, but they are primarily retrieved through search
of the associated text. Techniques are beginning to emerge to search these other media types
(e.g., EXCALIBUR’s Visual RetrievalWare, VIRAGE video indexer). The focus
of this book is on research and implementation of search, retrieval and
representation of textual and multimedia sources. Commercial development of
pattern matching against other data types is starting to be a common function
integrated within the total information system. In some systems the text may only
be an identifier to display another associated data type that holds the substantive
information desired by the system’s users (e.g., using closed captioning to locate
video of interest). The term “user” in this book represents an end user of the
information system who has minimal knowledge of computers and technical fields
in general.
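
The closed-captioning case above, where text serves only as an identifier for
another data type, can be sketched as follows. This is an illustrative
assumption, not a description of any particular product: the caption records,
video identifiers, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical caption records: (video_id, start_second, caption_text).
CAPTIONS = [
    ("news_0412", 15, "The storm moved north overnight"),
    ("news_0412", 48, "Flooding closed the coastal highway"),
    ("docu_0033", 5, "Coastal erosion has accelerated"),
]


def build_caption_index(captions):
    """Map each lowercased caption word to the (video_id, start_second) pairs where it occurs."""
    index = defaultdict(list)
    for video_id, start, text in captions:
        for word in set(text.lower().split()):
            index[word].append((video_id, start))
    return index


def locate_video(index, term):
    """Return the video positions whose caption text contains the search term."""
    return sorted(index.get(term.lower(), []))


index = build_caption_index(CAPTIONS)
print(locate_video(index, "coastal"))  # [('docu_0033', 5), ('news_0412', 48)]
```

The search runs entirely against text, while the result the user wants is the
video segment the text points to.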
The term “item” is used to represent the smallest complete unit that is
processed and manipulated by the system. The definition of item varies by how a
specific source treats information. A complete document, such as a book,
newspaper or magazine could be an item. At other times each chapter, or article
may be defined as an item. As sources vary and systems include more complex
processing, an item may address even lower levels of abstraction such as a
contiguous passage of text or a paragraph. For readability, throughout this book
the terms “item” and “document” are not in this rigorous definition, but used
