work well for manually and automatically created filters; however, around 25% of the queries still suffer
from query drift.
Buckley et al., [BSA+95], also looked at improving precision at the top of the initial document ranking.
They used massive query expansion (500 terms and ten phrases, i.e. commonly occurring pairs of words)
from the top 30 retrieved documents. Their experiments produced significantly better results than with
no feedback, particularly with respect to the precision of the new document ranking.
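The general shape of this kind of pseudo RF query expansion can be sketched as follows. This is only an illustrative sketch, not the method of [BSA+95]: the function name expand_query, its parameters, and the choice to rank candidate terms by how many of the top documents contain them are all assumptions made for the example.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=30, n_terms=500):
    """Pseudo RF sketch: treat the top_k ranked documents as relevant and
    append the n_terms candidate terms found in the most of them.

    ranked_docs is a list of documents, each a list of term strings,
    ordered by the initial retrieval score (best first).
    """
    doc_freq = Counter()
    for doc in ranked_docs[:top_k]:
        doc_freq.update(set(doc))  # count each term once per document

    # Assumed selection rule: most frequent first, ties broken alphabetically.
    candidates = sorted(doc_freq, key=lambda t: (-doc_freq[t], t))
    expansion = [t for t in candidates if t not in query_terms][:n_terms]
    return list(query_terms) + expansion


if __name__ == "__main__":
    docs = [["cat", "dog"], ["cat", "fish"], ["bird"]]
    print(expand_query(["dog"], docs, top_k=2, n_terms=1))
```

The defaults of 30 documents and 500 terms mirror the figures quoted above; phrase expansion is left out of the sketch.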
Most other researchers have concentrated on improving the feedback used in the pseudo RF
approaches. Efthimiadis and Biron, [EB94], for example, found in their experiments that standard RF
techniques used in pseudo RF experiments performed only slightly worse than experiments using RF
from users, and better than experiments with no feedback. However, the actual performance varied according to the algorithm
used to rank terms for query expansion. Robertson et al., [RWJ+95], also found increased performance
over no feedback, especially when using passages rather than the whole document, to select expansion
terms.
In [Lee98], Lee proposed an ad hoc RF technique based on multiple RF techniques. The basic
hypothesis is that, as different RF techniques may produce different modified queries, and different
queries will retrieve different documents, then using a combination of RF techniques to modify queries
will retrieve more of the relevant documents. An initial experiment was carried out treating the top 30
documents as relevant and using a vector space retrieval function. This experiment compared the
documents retrieved after performing pseudo retrieval using a Rocchio technique, Ide dec-hi, F4, a
variant of F4 (footnote 24), and a simplified version of Fuhr's RPI formula, [FB91], Equation 17.
\[
w_{qi} = \log \frac{p_i (1 - q_i)}{q_i (1 - p_i)}, \qquad
p_i = \frac{\sum_{r=1}^{n_{rel}} w_{ri}}{n_{rel}}, \qquad
q_i = \frac{\sum_{n=1}^{n_{nonrel}} w_{ri}}{n_{nonrel}}
\]
Equation 17: Version of RPI used in [Lee98]
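As a worked illustration of Equation 17, the weight for a single term can be computed as in the sketch below. It assumes that p_i and q_i are the mean weights of term i over the documents treated as relevant and non-relevant respectively, and that those weights lie in [0, 1]. The function name rpi_weight and the clamping constant eps are additions for the example, not part of [Lee98].

```python
import math


def rpi_weight(rel_weights, nonrel_weights, eps=1e-6):
    """Sketch of the Equation 17 term weight for one term i.

    rel_weights    -- weights of term i in the documents treated as relevant
    nonrel_weights -- weights of term i in the documents treated as non-relevant
    eps            -- assumed clamp that keeps the logarithm finite
    """
    p = sum(rel_weights) / len(rel_weights)        # p_i
    q = sum(nonrel_weights) / len(nonrel_weights)  # q_i

    # Assumption: clamp p_i and q_i away from 0 and 1 to avoid log(0)
    # and division by zero.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)

    return math.log((p * (1.0 - q)) / (q * (1.0 - p)))


if __name__ == "__main__":
    print(rpi_weight([0.8, 0.8], [0.2, 0.2]))
```

A term that is equally weighted in both sets receives a weight of zero, while a term weighted more heavily in the relevant set receives a positive weight.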
This experiment validated Lee's initial hypothesis: different RF techniques retrieved different
documents although the different RF algorithms performed at approximately the same level of retrieval
effectiveness. The similarity of the documents retrieved by each RF algorithm varied according to the
RF technique used (e.g. the two F4 techniques retrieved very similar documents but Rocchio compared
with the modified F4 formula only had about 50% of documents in common). The difference between
the various RF techniques was also reflected in the query terms used to expand the query.
A second experiment combined the rankings, after normalisation of similarity values, obtained from the
different modified query vectors. Combination of the rankings can provide significant improvements in
effectiveness over the single RF methods. However, more combination is not always better:
combinations of two or three RF algorithms generally performed better than combinations of four or
five RF algorithms. Given that the algorithms produce different rankings, after new retrieval, one might
expect that the more different are the rankings, the better the combined performance. However, Lee's
experiments did not generally demonstrate this conclusively.
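One possible way of combining rankings after normalising similarity values is sketched below. The min-max normalisation and the summation of normalised scores are assumptions chosen for illustration; the text above does not specify which normalisation or combination function was used in [Lee98], and the function names are hypothetical.

```python
def normalise(scores):
    """Min-max normalise a {doc_id: similarity} mapping into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_rankings(runs):
    """Sum the normalised scores a document receives across all runs and
    return document ids ordered by combined score (best first)."""
    combined = {}
    for run in runs:
        for doc, score in normalise(run).items():
            combined[doc] = combined.get(doc, 0.0) + score
    # Ties are broken by document id so the output is deterministic.
    return sorted(combined, key=lambda d: (-combined[d], d))


if __name__ == "__main__":
    rocchio_run = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
    f4_run = {"d2": 1.0, "d3": 0.5, "d4": 0.0}
    print(combine_rankings([rocchio_run, f4_run]))
```

Under this scheme a document retrieved by several modified queries accumulates score from each, which is one reading of why combining runs that retrieve different documents can help.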
Although the pseudo RF techniques described in this section can improve retrieval performance over
not using pseudo RF, the problem still remains that it is a variable technique: some queries will be
improved, others will be harmed. Several of the authors mentioned indicate that uncovering more
details about the collection statistics, documents being used for RF and query characteristics may be
used to predict which queries should be used for pseudo RF. For example, Lindquist et al., [LGF97]
investigated various parameters for automatic RF using the vector space model and found optimal
performance was gained using between 5 and 20 documents and between 1 and 20 terms for feedback. They also provide
support for weighting new query terms against original query terms, using within-document term
frequency and thresholding the query terms (only performing relevance feedback on queries that have
terms with a high idf value). This leads to the suggestion that certain characteristics of a term may be
good at predicting how the query is likely to improve given expansion by that term, which may be
useful in pseudo feedback.
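The idea of thresholding on idf can be sketched as a simple gate applied before pseudo RF. The idf formula log(N/n), the rule that a single high-idf term is enough, and the names used here are assumptions for illustration; [LGF97] is not quoted above with a specific formula or threshold value.

```python
import math


def idf(term, doc_freq, n_docs):
    """Inverse document frequency, assumed here to be log(N / n)."""
    return math.log(n_docs / doc_freq[term])


def should_apply_feedback(query_terms, doc_freq, n_docs, threshold):
    """Return True if at least one query term that occurs in the
    collection has an idf value at or above the threshold."""
    return any(
        idf(term, doc_freq, n_docs) >= threshold
        for term in query_terms
        if doc_freq.get(term, 0) > 0  # skip terms absent from the collection
    )


if __name__ == "__main__":
    doc_freq = {"the": 1000, "rocchio": 10}
    print(should_apply_feedback(["the", "rocchio"], doc_freq, 1000, 2.0))
```

A query made up only of very common terms would be passed through without feedback, while a query containing a rare term would be expanded.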
24 [Rob86], Equation 12.