This section looks at using information from the former group of documents, i.e. those that have been
explicitly marked as being non relevant. This form of feedback is called negative relevance feedback.
Negative relevance feedback has generally been regarded as problematic for three main reasons.
i. Implementation. One difficulty with negative relevance feedback is that it is not clear how
negative information should be handled by the system. A common decision in IR is to remove from the
query those terms that have a negative weight, i.e. those terms that are better at retrieving non relevant
than relevant documents. Negative feedback can be used to better indicate which terms should receive a
negative weight.
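The idea of a negative term weight can be made concrete with a minimal sketch. It assumes each document is represented as a set of terms and uses a smoothed log-odds weight; the weighting formula, the 0.5 smoothing constant, the function names and the toy documents are all illustrative assumptions rather than the scheme of any particular system discussed here.

```python
import math

def term_weights(relevant_docs, non_relevant_docs):
    """Smoothed log-odds of a term occurring in relevant vs non relevant documents.

    Assumption: this particular formula is chosen for illustration only.
    """
    R, N = len(relevant_docs), len(non_relevant_docs)
    vocabulary = set().union(*relevant_docs, *non_relevant_docs)
    weights = {}
    for term in vocabulary:
        r = sum(term in doc for doc in relevant_docs)      # relevant docs containing term
        n = sum(term in doc for doc in non_relevant_docs)  # non relevant docs containing term
        odds_relevant = (r + 0.5) / (R - r + 0.5)
        odds_non_relevant = (n + 0.5) / (N - n + 0.5)
        weights[term] = math.log(odds_relevant / odds_non_relevant)
    return weights

def prune_query(query_terms, weights):
    """Drop query terms whose weight is negative; unseen terms are kept."""
    return [t for t in query_terms if weights.get(t, 0.0) >= 0.0]

# Toy data (hypothetical): two documents marked relevant, two marked non relevant.
relevant = [{"jaguar", "car", "engine"}, {"jaguar", "car", "speed"}]
non_relevant = [{"jaguar", "cat", "jungle"}, {"jaguar", "cat", "zoo"}]

weights = term_weights(relevant, non_relevant)
pruned = prune_query(["jaguar", "car", "cat"], weights)
```

In this toy case "cat" occurs only in the non relevant documents, so it receives a negative weight and is removed, while "jaguar", which is spread evenly across both sets, is kept.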
Belkin et al., in a long-running study of users' involvement in RF [BCK+96, BCC+97, BCC98,
BCK+99], propose an alternative model. They hypothesise that terms which appear in negatively, as
well as positively, assessed documents may be good query terms. These terms are good in the sense that
they can retrieve relevant documents. However, these terms may appear in the wrong context in the
document, or the document may not discuss them fully or in the way the user requires. In
their model, what is important is not the distribution of a term between the relevant and non relevant
documents but the context of terms. Terms that appear only in negative documents can be used to
indicate inappropriate contexts, main topics etc. of the useful terms and perhaps this could lead to the
detection of different reasons for non relevance.
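One possible reading of this hypothesis can be sketched as a partition of the assessed vocabulary. The sketch assumes documents are sets of terms; the function name, the category labels and the toy documents are hypothetical, and this is not the implementation used by Belkin et al.

```python
def partition_terms(positive_docs, negative_docs):
    """Split terms by the kind of assessed document they appear in."""
    pos_terms = set().union(*positive_docs)
    neg_terms = set().union(*negative_docs)
    return {
        # In both positively and negatively assessed documents:
        # candidate query terms under the hypothesis.
        "shared": pos_terms & neg_terms,
        # Only in negatively assessed documents:
        # possible indicators of an inappropriate context.
        "negative_only": neg_terms - pos_terms,
        # Only in positively assessed documents.
        "positive_only": pos_terms - neg_terms,
    }

# Toy data (hypothetical).
positive = [{"solar", "panel", "efficiency"}]
negative = [{"solar", "panel", "advert", "price"}]

groups = partition_terms(positive, negative)
```

Here "solar" and "panel" remain useful query terms even though they occur in a rejected document, while "advert" and "price" hint at why that document was rejected.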
Belkin and his colleagues carried out a series of experiments, mainly reported in [BCK+96, BCC+97],
which examined how users utilised negative feedback. In the experimental system reported in
[BCK+96], subjects could explicitly mark a document relevant or non relevant, and were given
suggestions as to terms that could be added to the user's query. The terms themselves could be
positively or negatively weighted. Although subjects preferred the system that allowed negative and
positive feedback, they did not feel that negative feedback was very useful. Belkin et al. give several
reasons for this, based on subjects' comments. Often subjects were concerned that negative relevance
assessments would stop the retrieval of relevant documents. That is, they were concerned that the
system would not understand upon what information the negative decision was based. Similarly,
subjects were concerned that negative RF would lower the rank position of relevant documents.
An additional concern for the subjects was that negative feedback was a more difficult decision to
make. The experimental conditions, in particular the time constraint imposed by the experiments, led
some subjects to feel that negative feedback was too unpredictable to use. Other reasons for the lack of
use of negative RF include the perceived topic dependency of negative RF (that is, negative RF is only
appropriate for some topics), the lack of control as to which terms were negatively weighted, and
problems relating to word stemming. This last problem results from the fact that useful and non useful
terms may be reduced to the same stem.
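The stemming problem can be illustrated with a deliberately crude suffix stripper. The stemmer below is a hypothetical stand-in written for this example, not the stemmer used in the experiments, and the term pair and weight are invented for illustration.

```python
def toy_stem(term):
    """Strip one common suffix, keeping at least three characters.

    Assumption: a toy rule set chosen only to show a stem collision.
    """
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

# Suppose the user negatively weights the unwanted term "new".
stem_weights = {toy_stem("new"): -1.0}

# The wanted term "news" is reduced to the same stem,
# so it inherits the negative weight.
weight_for_news = stem_weights.get(toy_stem("news"), 0.0)
```

Because weights are attached to stems rather than to surface forms, the system cannot penalise one term without also penalising the other.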
One use of negative feedback that was requested by users, also noted by Sumner et al. [SYA+98], was
the suppression of previously seen, non relevant documents. These documents were discarded by the
user but reappeared in the ranking if they matched the new query. A common request by users was that
these documents should not be re-retrieved.
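Suppression of this kind amounts to filtering each new ranking against the set of documents the user has already rejected. A minimal sketch, assuming documents are identified by string ids (the function name and the ids are hypothetical):

```python
def suppress_seen(ranking, rejected_ids):
    """Return the ranking with previously rejected documents removed, order preserved."""
    rejected = set(rejected_ids)
    return [doc_id for doc_id in ranking if doc_id not in rejected]

# Toy data (hypothetical): d2 and d4 were marked non relevant in an earlier iteration.
new_ranking = ["d7", "d2", "d9", "d4"]
filtered = suppress_seen(new_ranking, {"d2", "d4"})
```

This uses the negative assessment only to hide documents; it does not alter the query or the term weights, so it avoids the concerns about relevant documents being pushed down the ranking.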
The experimental results from [BCK+96] were equivocal but hinted at potential improvements when
subjects used a mixture of negative and positive feedback. The experiments reported in [BCC+97]
reiterated many of the conclusions from [BCK+96], namely that although users may use negative
feedback, the gain in performance is not significant, and users are unsure about the process of making
negative assessments. A more positive indication from [BCC+97] is that users' familiarity with negative
feedback may be an important factor in its success: the more familiar a user is with this option, the more
comfortable they are with using it.
ii. Clarity. It may also be difficult to specify under what conditions a user should consider
and mark a document non relevant. There are many reasons why a document is not considered relevant,
e.g. if the document contains absolutely no relevant information, contains no information related to
what a user is searching for, contains topically related but non relevant information, if the document is
relevant but not relevant enough, and so on. Any of these definitions may apply within or across
searches. The issue here is: when should a user mark a document non relevant?
It could be argued that this problem also applies to positive feedback: when should a user mark a
document relevant? However, we believe that this issue is more central to negative feedback for two