A survey on the use of relevance feedback
for information access systems
Ian Ruthven
Department of Computer and Information Sciences
University of Strathclyde, Glasgow, G1 1XH.
Ian.Ruthven@cis.strath.ac.uk
Mounia Lalmas
Department of Computer Science
Queen Mary, University of London, London, E1 4NS.
mounia@dcs.qmul.ac.uk
Abstract
Users of online search engines often find it difficult to express their need for
information in the form of a query. However, if the user can identify examples
of the kind of documents they require then they can employ a technique known
as relevance feedback. Relevance feedback covers a range of techniques
intended to improve a user's query and facilitate retrieval of information
relevant to a user's information need. In this paper we survey relevance
feedback techniques. We study both automatic techniques, in which the system
modifies the user's query, and interactive techniques, in which the user has
control over query modification. We also consider specific interfaces to
relevance feedback systems and characteristics of searchers that can affect the
use and success of relevance feedback systems.
1 Introduction
Information retrieval (IR) systems allow users to access large amounts of electronically stored
information objects [VR79, BYRN99, Bel00]. A user submitting a request to an IR system will receive,
in return, a number of objects relating to her request. These objects may include images, pieces of text,
web pages, segments of video or speech samples.
A number of features distinguish IR systems from other information access tools. For example, an IR
system does not extract information from the objects that it accesses. Neither, typically, does it process
information contained within these objects. This separates IR systems from knowledge-based systems
such as expert systems, conceptual graphs or semantic networks. These knowledge-based tools depend
heavily on a pre-defined representation of a domain, such as medicine or law. This domain knowledge
can be used to manipulate, infer or categorise information for a user. IR systems, by contrast, are used to
direct the user to objects that may help satisfy a need for information.
The data accessed by IR systems is usually unstructured, or at best semi-structured. The requests
submitted to IR systems are generally also unstructured. Whereas a database system will be used to
answer requests such as "How many female members of parliament are there in the British
Parliament?" or "Which British MPs are women?", IR systems will be used to answer requests such as
"What are the main causes of the poor representation of women in UK politics?" or "In what ways are
the British political parties attempting to increase the number of female MPs?". IR systems are
intended to deal with requests that do not necessarily specify a unique, objective answer.
The process of IR is, therefore, an inherently uncertain one. Searchers may not have a well-developed
idea of what information they are searching for, they may not be able to express their conceptual idea of
what information they want as a suitable query, and they may not have a good idea of what
information is available for retrieval. Early in the field, researchers recognised that, although users had
difficulty expressing exactly the information that they required, they could recognise useful information
when they saw it. That is, although searchers might not be able to convert their need for information into
a request, once the system had presented the user with an initial set of documents, the user could
indicate those documents that did contain useful information.