This led to the notion of relevance feedback (RF): users marking documents as relevant to their needs 
and presenting this information to the IR system. The system can then use this information 
quantitatively, by retrieving more documents like the relevant documents, and qualitatively, by retrieving 
documents similar to the relevant ones before other documents. The process of RF is usually presented 
as a cycle of activity: an IR system presents a user with a set of retrieved documents, the user indicates 
those that are relevant and the system uses this information to produce a modified version of the query. 
The modified query is then used to retrieve a new set of documents for presentation to the user. This 
process is known as an iteration of RF.  
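As an informal illustration, one iteration of this cycle can be sketched in a few lines of Python. This is a toy under stated assumptions, not any of the models discussed later: the scoring rule (counting shared terms), the query modification rule (adding the most frequent new terms from the relevant documents), and the names retrieve, modify_query, feedback_iteration and judge are all hypothetical choices made only for the example.

```python
from collections import Counter


def tokenize(text):
    # Crude tokenisation: lower-case and split on whitespace.
    return text.lower().split()


def retrieve(query_terms, collection, k=2):
    """Rank documents by how many query terms they contain (toy scoring)."""
    scores = {}
    for doc_id, text in collection.items():
        overlap = len(set(tokenize(text)) & set(query_terms))
        if overlap > 0:
            scores[doc_id] = overlap
    # Highest score first; ties broken by document id for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k]


def modify_query(query_terms, relevant_ids, collection, n_new_terms=2):
    """Assumed modification rule: add frequent terms from relevant documents."""
    counts = Counter()
    for doc_id in relevant_ids:
        counts.update(t for t in tokenize(collection[doc_id])
                      if t not in query_terms)
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n_new_terms]
    return list(query_terms) + [term for term, _ in best]


def feedback_iteration(query_terms, collection, judge, k=2):
    """One RF iteration: retrieve, collect judgements, modify the query."""
    retrieved = retrieve(query_terms, collection, k)
    relevant = [d for d in retrieved if judge(d)]  # user marks relevant docs
    return retrieved, modify_query(query_terms, relevant, collection)
```

Here judge stands in for the user: it is any function that returns True for the documents the user marks as relevant. The modified query returned by feedback_iteration would then be passed back to retrieve to begin the next iteration.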
The mechanism by which an IR system uses the relevance information given by the user is the main 
focus of this paper. The paper covers several aspects of RF: the representations used in RF, how these 
representations inform the decision of how to modify a query, and the role of interaction in RF. Section 2 
presents a brief discussion of the retrieval process as a whole and outlines how RF has been 
incorporated into the major retrieval models. In section 3 we discuss extensions and modifications to 
the traditional models of RF. 
Historically, most RF approaches have been based on automatic techniques for modifying queries. In 
section 4 we summarise these approaches. More recently, a number of researchers have examined the 
role of the user in RF and have presented techniques designed to increase the interaction between the 
user and system in RF. These interactive techniques are the main topic of section 5. In section 6 we 
describe interfaces specifically designed to facilitate RF, in section 7 we outline some of the 
aspects of the user that are important to RF, and we conclude this overview in section 8.  
2 The information retrieval process
The IR process is composed of four main technical stages. The first stage, indexing the document 
collection, during which the documents are prepared for use by an IR system, is discussed in section 
2.1. Document retrieval, the process of selecting which documents to display to the user, is described in 
section 2.2. The presentation of retrieved documents and the evaluation of the retrieval results are 
discussed briefly in sections 2.3 and 2.4 respectively. In the section on retrieval we shall outline the 
basic approaches to RF in the major retrieval models. In section 2.5 we shall summarise the difference 
between these main approaches to RF. 
2.1 Indexing 
For small collections of documents it may be possible for an IR system to assess each document in turn, 
deciding whether or not it is likely to be relevant to a user's query. However, for larger collections, 
especially in interactive systems, this becomes impractical. Hence it is usually necessary to convert the 
raw document collection into an easily accessible representation, one that can target those documents 
that are most likely to be relevant, for example those documents that contain at least one word that 
appears in the user's query.  
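One widely used representation of this kind is an inverted index, which maps each term to the set of documents containing it. The sketch below is only an assumption about how such targeting might be implemented; the text does not commit to a particular data structure, and the function names build_inverted_index and candidate_documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(collection):
    """Map each term to the set of ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in collection.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def candidate_documents(index, query):
    """Return documents containing at least one word of the query."""
    candidates = set()
    for term in query.lower().split():
        # .get avoids creating empty entries for unseen query terms.
        candidates |= index.get(term, set())
    return candidates
```

With such an index the system only inspects documents that share at least one term with the query, rather than assessing every document in the collection in turn.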
This transformation from a document text to a representation of a text is known as "indexing" the 
documents. There are a variety of indexing techniques but the majority rely on selecting good document 
descriptors, such as keywords, or terms, to represent the information content of documents. A "good" 
descriptor for IR is a term that helps describe the information content of the document but is also one 
that helps differentiate the document from other documents in the collection. A "good" descriptor, then, 
has a certain discriminatory power¹. This power of a term in discriminating documents can be used to 
differentiate between relevant and non-relevant documents, as will be discussed in the section on 
retrieval. 
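The paragraph above does not say how discriminatory power is measured. One common proxy, introduced here purely as an assumption for illustration, is inverse document frequency: a term that occurs in few documents separates those documents from the rest of the collection better than a term that occurs in nearly all of them. The function name inverse_document_frequency and the unsmoothed form log(N / n) are choices made for this sketch.

```python
import math


def inverse_document_frequency(term, collection):
    """log(N / n): N documents in total, n of them containing the term."""
    n_docs = len(collection)
    doc_freq = sum(1 for text in collection.values()
                   if term in text.lower().split())
    if doc_freq == 0:
        return 0.0  # term never occurs; treated as carrying no evidence
    return math.log(n_docs / doc_freq)
```

Under this measure a term that appears in every document scores zero, since it cannot help differentiate any document from the others.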
Figure 1 outlines the basic steps in transforming a document into an indexed form. The first stage is to 
convert the document text (Document text, Figure 1a) into a stream of terms, typically converting all 
the terms into lower case and removing punctuation characters (Tokenisation, Figure 1b).  
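The tokenisation step just described can be sketched as follows. This is a minimal version that assumes ASCII punctuation and whitespace-separated words; the function name tokenise is hypothetical.

```python
import string


def tokenise(text):
    """Lower-case the text, remove punctuation, and split into terms."""
    lowered = text.lower()
    # Delete every ASCII punctuation character.
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    return stripped.split()
```

Deleting punctuation outright merges some forms, so that "user's" becomes "users" and "re-stated" becomes "restated". Whether this is acceptable depends on the collection; replacing punctuation with spaces is an alternative design choice.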
                                                           
¹ See [VR79], Chapter 2, for a more detailed explanation of the trade-off between the descriptive and 
discriminatory power of terms. 