
Information Retrieval Systems

Term paper on
Evaluation of Information Retrieval
Systems
Submitted to
Prof. Mandar R. Mutalikdesai

By
M Ramprabakaran
M.Tech., ISiM



Abstract

With the rise of new trends on the World Wide Web, information retrieval has undergone tremendous changes, and evaluation is an important factor in information retrieval. This paper summarizes the evaluation of information retrieval and various approaches to evaluating information retrieval systems. The main limitation is that the data being searched keeps changing.

Introduction

Information retrieval means searching for information in documents; the information sought could be text or images. The term information retrieval was introduced by Calvin Mooers.[1]
The main focus of information retrieval is to provide relevant information to users on the basis of the requests they submit. According to C. Hernon and P. McClure, "Evaluation and Library Decision Making", "evaluation is satisfaction of the information demands". The satisfaction of the user can be determined by providing relevant data. How relevant data is provided is illustrated by the following diagram:

Fig 1: Providing relevant information. The user provides the information need, the system queries the user information and evaluates the user data; if the evaluation fails, the query is modified and the loop repeats; if it succeeds, the relevant information is provided and the process stops.


During evaluation, certain changes should be made to give better performance. Better performance, and hence user satisfaction, is finally judged on the basis of six measurable quantities:

(i) Coverage: the extent to which the collection includes all the content required to provide the relevant information.
(ii) Time lag: the average time taken by the system to provide the information being sought.
(iii) Form: the way in which the output is presented.
(iv) Effort: the effort the user must spend to request the information and retrieve the requested answer.
(v) Recall: the fraction of the relevant documents that has been retrieved.
(vi) Precision: the fraction of the retrieved documents that is relevant.[1]

Evaluation of Information Retrieval Systems by Binary Relevance


Consider the following example: "Drive slowly, your family is waiting at home". The information is queried as "drive AND slowly AND your AND family AND is AND waiting AND at AND home". Such a query returns both relevant and non-relevant documents. As another example, if the user searches for "Ruby", the principle of binary relevance is used. We cannot predict what the user is seeking: the user may want either the programming language or the red stone. Each retrieved document is rated on the basis of binary relevance, 1 if it is relevant to the intended sense (programming language or red stone) and 0 if it is not. The judgments are tabulated so that the two interpretations can be compared; the tabulation is shown below:



Retrieved document    Relevant for programming language "Ruby"    Relevant for red stone "Ruby"
1                     1                                           0
2                     0                                           0
3                     1                                           1
4                     1                                           0
5                     0                                           1
6                     0                                           0
7                     1                                           1

Tab 1
On this basis, binary relevance judgments are better than random samples; they maximize the measured performance. For effective evaluation of information retrieval, three things are necessary:
1. A collection of documents.
2. A set of queries.
3. Relevance judgments, i.e. relevant and non-relevant.[5]
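As a small illustration (my own sketch, not part of the paper), the binary judgments of Tab 1 can be turned into a precision value for each interpretation of "Ruby"; the 0/1 values below are copied directly from the table.

```python
# Precision over the binary relevance judgments of Tab 1 (illustrative sketch).
# 1 = relevant, 0 = not relevant, listed for retrieved documents 1..7.
judgments = {
    'programming language "Ruby"': [1, 0, 1, 1, 0, 0, 1],
    'red stone "Ruby"':            [0, 0, 1, 0, 1, 0, 1],
}

for interpretation, rels in judgments.items():
    retrieved = len(rels)            # all seven documents in Tab 1 were retrieved
    relevant = sum(rels)             # documents judged relevant for this interpretation
    print(f"{interpretation}: precision = {relevant}/{retrieved} = {relevant / retrieved:.2f}")
```

Under these judgments the programming-language reading would score a precision of 4/7 ≈ 0.57 and the red-stone reading 3/7 ≈ 0.43.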

Measuring Performance in Information Retrieval

Consider the following diagram



Fig 2: Diagram relating precision, the relevant documents, and the total number of documents.

Effectiveness is measured by recall and precision. In the diagram above, recall is the fraction of the relevant documents, among all the documents, that has been retrieved, and precision is measured among the retrieved documents. Performance can be measured in a four-cell space, i.e. by specifying relevant vs. retrieved:



Relevant and retrieved          Relevant and unretrieved
Not relevant and retrieved      Not relevant and unretrieved

Fig 3 [1]

It can be illustrated by the following example. Consider the set of documents F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, for which a query retrieves the following:

• F3, F4: relevant and retrieved; let this set be A.
• F6, F7: relevant and unretrieved; let this set be B.
• F1, F2, F8: not relevant and retrieved; let this set be C.
• F0, F9: not relevant and unretrieved; let this set be D.

The set of relevant documents is A + B, the set of non-relevant documents is C + D, the set of retrieved documents is A + C, and the set of unretrieved documents is B + D.

Now suppose, for a larger collection, that the relevant and non-relevant documents number

A + B = 12,

C + D = 13,

and the retrieved and unretrieved documents number

A + C = 15,

B + D = 10.



Then, taking A = 8, recall is obtained by the formula A/(A + B), i.e. 8/12 = 0.66, and precision is obtained as A/(A + C), i.e. 8/15 = 0.53.

If recall increases, the number of non-relevant documents retrieved also increases, so precision decreases. [1],[2]
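A minimal sketch of this computation (my own illustration, using the counts A = 8, B = 4, C = 7, D = 6 implied by the totals above):

```python
# Recall and precision from the four contingency counts:
# A = relevant and retrieved, B = relevant and unretrieved,
# C = not relevant and retrieved, D = not relevant and unretrieved.

def recall(a, b):
    """Fraction of the relevant documents (A + B) that were retrieved."""
    return a / (a + b)

def precision(a, c):
    """Fraction of the retrieved documents (A + C) that are relevant."""
    return a / (a + c)

A, B, C, D = 8, 4, 7, 6                                      # values from the example above
print(f"recall    = {A}/{A + B} = {recall(A, B):.2f}")       # 8/12 = 0.67
print(f"precision = {A}/{A + C} = {precision(A, C):.2f}")    # 8/15 = 0.53
```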

Evaluation of Ranked Retrieval

If you are searching for information on Google, roughly the first 11 results would be relevant and retrieved, i.e. true positive results are obtained; as precision increases, recall also increases. Ranked retrieval is evaluated using interpolated precision, which focuses on the highest precision, i.e. the maximum precision achieved for the relevant documents retrieved so far. The results are rated on a scale; consider the following table:

Relevant documents retrieved    Precision
1                               1.00
2                               0.80
3                               0.75
4                               0.70
5                               0.65
6                               0.50
7                               0.45
8                               0.30
9                               0.25
10                              0.20
11                              0.15

Tab 2

In the retrieved list, the first document is rated 1 and the documents up to the 11th are rated as shown. In this case precision is high at first and then, beyond a certain point, decreases while recall remains the same. This is illustrated by the following graph:[7],[4]

Fig 4: Precision (y-axis, 0 to 1.2) plotted against ranks 1 to 11 of the retrieved relevant documents (x-axis), falling from 1.0 at rank 1 to 0.15 at rank 11, as in Tab 2.
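The interpolated precision mentioned above takes, at each point, the maximum precision observed at that point or any later one in the ranking. A minimal sketch of that idea over the Tab 2 values (my own illustration, not from the paper):

```python
# Interpolated precision: each value is replaced by the maximum precision
# seen at that rank or any later rank in the result list.
precisions = [1.00, 0.80, 0.75, 0.70, 0.65, 0.50,
              0.45, 0.30, 0.25, 0.20, 0.15]          # precision column of Tab 2

def interpolate(values):
    """Scan from the end of the ranking backwards, keeping the running maximum."""
    interpolated, best = [], 0.0
    for p in reversed(values):
        best = max(best, p)
        interpolated.append(best)
    return list(reversed(interpolated))

print(interpolate(precisions))
# The Tab 2 values are already non-increasing, so interpolation leaves them unchanged.
```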

Evaluation of Unranked Retrievals

Under certain circumstances precision decreases while recall remains the same. After the first 11 documents precision starts falling, because most of the further documents are not relevant; these are counted as false positives. Precision and recall are combined into a single effectiveness figure, the F-measure (the harmonic mean of the two), calculated using the formula

F = 2PR / (P + R), where P is precision and R is recall.

Consider an example: 8 relevant and 10 non-relevant documents are retrieved, out of 20 relevant documents in the collection. Precision and recall are calculated using the formulas precision = true positives / (true positives + false positives) and recall = true positives / (true positives + false negatives):

Recall = 8/20 = 0.4, precision = 8/18 = 0.44,



F = (2 × 0.4 × 0.44) / 0.84 = 0.419, i.e. about 41.9%. [6],[4]
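A minimal sketch of this F-measure calculation (my own illustration; using exact rather than rounded precision and recall gives 0.421 instead of the 0.419 obtained above from the rounded values):

```python
# Balanced F-measure: the harmonic mean of precision and recall.

def f_measure(precision, recall):
    """F = 2PR / (P + R)."""
    return 2 * precision * recall / (precision + recall)

precision = 8 / 18    # 8 relevant documents among the 18 retrieved
recall = 8 / 20       # 8 of the 20 relevant documents were retrieved
print(f"F = {f_measure(precision, recall):.3f}")   # F = 0.421, i.e. about 42%
```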

Approaches for Evaluation of Information Retrieval Systems

Cranfield experiment

The Cranfield experiments are based on the principle of an "indexing language": if you search for "desk" in Google, only documents containing "desk" will be displayed, so "desktop" and other information related to desktops will not be displayed.

Cranfield comprised two stages: the first is Cranfield 1 and the second is Cranfield 2. Cranfield 2 is a subset of Cranfield 1.

There are three ways in which Cranfield 2 is processed:

(i) Single terms
(ii) Multi-terms
(iii) Thesaurus-based controlled language

With single terms, if the user seeks only one word, say "desk", Cranfield 2 retrieves only documents about "desk" (documents containing "desktop", "desktop monitor" and other desktop-related information are discarded). Cranfield 2 sends the result to Cranfield 1, so Cranfield 1 gives better performance and effectiveness.

With multi-terms, if the user seeks several words, say "Sachin" and "Dravid", Cranfield 2 retrieves only documents about Sachin and Dravid (in this case, a document containing Sachin, Sehwag and Dravid is rejected by Cranfield 2). Cranfield 2 sends the result to Cranfield 1, so Cranfield 1 gives better performance and effectiveness.

With the thesaurus-based controlled language, if the user seeks, say, "Sobel filters are used for edge detection", Cranfield 2 splits the query into the words "Sobel", "filters", "are", "used", "for", "edge" and "detection" (Cranfield 2 supports up to 32 split words). It then seeks documents according to the information the user provides and rejects the unrelated words. Cranfield 2 sends the result to Cranfield 1, so Cranfield 1 gives better performance and effectiveness.
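As a rough, hypothetical sketch of the splitting step just described (my own illustration, not the actual Cranfield procedure; the stop-word list is invented, while the 32-term cap comes from the description above):

```python
# Split a query into individual index terms, dropping common function words,
# and cap the number of terms at 32 as described above.
STOP_WORDS = {"are", "used", "for", "and", "the", "is", "of", "at"}   # illustrative list

def index_terms(query, max_terms=32):
    """Lowercase the query, split on whitespace, drop stop words, keep at most max_terms."""
    terms = [t for t in query.lower().split() if t not in STOP_WORDS]
    return terms[:max_terms]

print(index_terms("Sobel filters are used for edge detection"))
# ['sobel', 'filters', 'edge', 'detection']
```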

Cranfield mainly focuses on true positive terms, so recall and precision both increase.[1],[2],[8]

STAIRS

STAIRS stands for Storage and Information Retrieval System. It is widely used for searching within private and public organizations. STAIRS uses a database system to store the documents, and the storage is organized in two ways:

(i) Document information retrieval
(ii) Reference information retrieval
Document information retrieval is used to store items such as the name, address and phone number, while reference information retrieval is used for the physical location. Consider an example: a user is searching for "International School of Information Management". The first item on the results page shows a map of where it is located; this content is provided by the reference information retrieval system. The second item on the page displays its address, so the document information retrieval system retrieves that content. [9],[1]

SMART

SMART stands for System for the Mechanical Analysis and Retrieval of Text. SMART uses page-ranking algorithms, and Google and Yahoo use the SMART algorithm. Every piece of metadata is ranked: Google and Yahoo record hits for each metadata item the user views. The items the user views most often receive the maximum number of hits and the least-viewed items the minimum; if the user never views an item, the document is removed.[2],[3]



INEX

INEX stands for the Initiative for the Evaluation of XML Retrieval. It is mainly used for collections such as CiteSeer, IEEE Xplore and the ACM portal. Each article is stored in XML and has a front matter and a body; the body contains sections, each section contains a title and paragraphs, the abstract and introduction are added in the title part, and the content is added in the paragraphs. The front matter contains the journal title and the article title. INEX mainly focuses on relevance and coverage. The measure used by INEX is

Relevant items retrieved = Σ Q(rel(c), cov(c))

Here Q(rel(c), cov(c)) denotes a quantization of how relevant a retrieved component c is and how well it covers the topic; a small numeric sketch follows the list below. The judgments are graded in four ways:

(i) Highly relevant.
(ii) Marginally relevant.
(iii) Fairly relevant.
(iv) Not relevant.[10]
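A minimal numeric sketch of this measure (my own illustration; the quantization Q is taken here simply as the product of hypothetical relevance and coverage scores, not INEX's official quantization):

```python
# Sum of Q(rel(c), cov(c)) over the retrieved XML components c, with Q taken
# here as the product of quantized relevance and coverage scores.

# Hypothetical quantization of the graded relevance judgments onto [0, 1].
REL = {"highly": 1.00, "fairly": 0.75, "marginally": 0.50, "not": 0.00}

def relevant_items_retrieved(components):
    """components: list of (relevance, coverage) score pairs for retrieved items."""
    return sum(rel * cov for rel, cov in components)

# Three retrieved components with hypothetical (relevance, coverage) scores.
components = [(REL["highly"], 1.0), (REL["marginally"], 0.5), (REL["not"], 0.0)]
print(relevant_items_retrieved(components))   # 1.0*1.0 + 0.5*0.5 + 0.0*0.0 = 1.25
```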

TREC
TREC stands for the Text REtrieval Conference. It is used when searching for journal articles in Google or Yahoo; the search links to web sites such as CiteSeer and IEEE Xplore. Once such a site opens, it presents the user with a login form, after which the user can view the article.
TREC uses routing and ad hoc tasks. Routing is used to search for newer documents: it sends the query, i.e. the sought article, to the ad hoc component, which finds the related articles and gives them back to routing. Routing then sends the related articles to the search engine, i.e. Google or Yahoo.[1],[2]

Cognitive paradigm
The cognitive paradigm mainly focuses on the user's evaluation criteria for information; the identified factors include coverage, requirements/needs, ease of access, social pressure and time considerations. [1]



Conclusion

Evaluation is a very important factor in information retrieval. It is a process that focuses not only on precision and recall but also on the sets of relevant and retrieved documents. Evaluating the relevant and retrieved documents is a very difficult process, and research work on evaluating retrieved and relevant documents is ongoing.

While retrieving documents the system follows queries; the querying should be done twice and judged on the average of recall and precision. This gives efficient and flexible results and also avoids redundancy. [1]



References

Paper used for References:

1. Evaluation in information retrieval: An overview, by Hamaza Hydri Syed, Department of Information and Communication Technology.

Slide used for References:

2. Evaluation of Evaluation in Information Retrieval – Tefko Saracevic. Historical Approach to IR Evaluation.

Book used for References:

3. The Google Way: How One Company is Revolutionizing Management As We Know It, by Bernard Girard, pp. 13–16.
4. Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan & Hinrich Schütze.

URL used for References:

5. http://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html
6. http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-unranked-retrieval-sets-1.html
7. http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
8. http://blog.codalism.com/?p=845
9. http://www.infoplease.com/ce6/sci/A0825197.html
10. http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-xml-retrieval-1.html



Figures used for references

1. Fig 1 – Evaluation in information retrieval: An overview, by Hamaza Hydri Syed, Department of Information and Communication Technology.
2. Fig 2 – Evaluation in information retrieval: An overview, by Hamaza Hydri Syed, Department of Information and Communication Technology.
3. Fig 3 – Evaluation in information retrieval: An overview, by Hamaza Hydri Syed, Department of Information and Communication Technology.
4. Fig 4 – http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html

Tabular columns used for references:

1. Tab 1 – http://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html
2. Tab 2 – http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
