combination. The overall system then selected which retrieval algorithm was giving the best 
performance for the user at each feedback iteration. 
Smeaton, [Sme98], suggests that retrieval strategies which are conceptually independent should work 
better in combination, and that retrieval strategies that work to the same general level of effectiveness 
should be suitable for conjunction, but again this is not always guaranteed to work. In particular, the 
results presented in [Sme98] indicated that conceptual independence of techniques in retrieving 
different documents did not appear to make a significant difference in an experimental setting. However, 
support for this claim is to be found in [RLVR02a].
3.4.3 Multiple feedback algorithms 
For RF, a natural combination of evidence is to combine the results of different feedback methods. This 
could involve either combining the rankings given by different RF methods run on the same original 
query and relevance assessments, or combining the modified queries produced by several RF methods. 
This would be similar in spirit to Belkin et al.'s data fusion approach described in section 3.4. Lee, 
[Lee98], examined the former approach, combining rankings from multiple feedback functions; this 
will be discussed separately in section 3.5, in the discussion of relevance feedback without relevance 
information, as this was the main area of Lee's work. 
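As an illustration of the first option, combining the rankings given by different RF methods, the following is a minimal sketch. It assumes each method returns a score per document, and it fuses them by summing min-max normalised scores. This fusion rule, the function names and the example scores are assumptions made for illustration; they are not the specific method examined in [Lee98].

```python
from collections import defaultdict


def normalise(scores):
    """Min-max normalise a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong evidence.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_rankings(score_lists):
    """Sum normalised scores across methods and rank by the fused total."""
    totals = defaultdict(float)
    for scores in score_lists:
        for doc, s in normalise(scores).items():
            totals[doc] += s
    # Descending fused score; ties are broken by document id.
    return sorted(totals, key=lambda doc: (-totals[doc], doc))


# Hypothetical scores from two feedback methods run on the same query
# and the same relevance assessments.
method_a = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
method_b = {"d2": 10.0, "d3": 5.0, "d4": 0.0}
fused = fuse_rankings([method_a, method_b])
```

A document that is ranked moderately well by both methods (here `d2`) can overtake a document ranked highly by only one of them, which is the intended effect of combining evidence.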
3.4.5 Summary of combination of evidence for RF 
Combination of evidence has the potential to be a powerful technique for RF. However, the majority of 
techniques attempted have shown that combination of evidence is a very variable technique for initial 
retrieval. It will improve some queries but degrade the performance on others. In addition, it is also 
very difficult to predict what evidence to combine for different collections or queries. Using relevance 
information, section 3.4.1, to guide the combination process does seem to overcome at least some of 
these difficulties. 
3.5 Relevance feedback without relevance information
RF, as described so far, depends on a user providing relevance assessments for a sample of the 
retrieved documents. An alternative approach, known either as pseudo, blind or ad hoc RF, employs 
RF techniques to automatically improve a ranking before any documents have been shown to the user.  
In this technique the system generates a document ranking from the initial query, selects a small number 
of documents from the top of the ranking, then initiates an iteration of RF by assuming these top ranked 
documents are all relevant (the pseudo relevant documents). The new query, generated by RF, is then 
used to produce a new document ranking which is shown to the user. The basis behind pseudo RF is 
that an iteration of feedback, based on the most similar documents to the user's initial query, will give a 
better initial ranking of documents. 
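The procedure described above can be sketched as follows. This is a minimal illustration only: the matching score (a count of query term occurrences), the choice of expansion terms (the most frequent non-query terms in the pseudo relevant documents) and all names and parameters are assumptions, not the weighting schemes used in the work cited here.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy matching score: occurrences of query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in set(query_terms))


def rank(query_terms, docs):
    """Return document ids by descending score (ties broken by id)."""
    return sorted(docs, key=lambda d: (-score(query_terms, docs[d]), d))


def pseudo_rf(query_terms, docs, n_docs=2, n_terms=2):
    """One iteration of pseudo RF from the top-ranked documents."""
    initial = rank(query_terms, docs)
    pseudo_relevant = initial[:n_docs]  # assumed relevant, never checked
    # Count candidate expansion terms in the pseudo relevant documents.
    candidates = Counter()
    for d in pseudo_relevant:
        candidates.update(t for t in docs[d] if t not in query_terms)
    # Most frequent first; alphabetical order keeps the choice deterministic.
    ordered = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [t for t, _ in ordered[:n_terms]]
    new_query = list(query_terms) + expansion
    return new_query, rank(new_query, docs)


# Hypothetical collection: each document is a list of index terms.
docs = {
    "d1": ["relevance", "feedback", "query", "expansion"],
    "d2": ["relevance", "feedback", "query", "terms"],
    "d3": ["query", "expansion", "thesaurus"],
    "d4": ["web", "hosting"],
}
new_query, new_ranking = pseudo_rf(["relevance", "feedback"], docs)
```

In this example `d3` shares no terms with the original query, but it matches the expanded query because the two top-ranked documents contribute the terms `query` and `expansion`.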
This technique was first suggested by Croft and Harper, [CH79], as a means of estimating probabilities 
within the probabilistic model for an initial search (footnote 23). It has since been widely investigated as a 
technique for improving document rankings. Croft and Harper also pointed to the fact that this method 
of improving a document ranking can suffer from one major flaw: query drift. Query drift occurs when 
the documents used for RF contain few or no relevant documents. In this case, RF will add terms to the 
query that are poor at detecting relevance, and hence in retrieving relevant documents.  
The pseudo RF technique, then, works well for `good' initial queries (those that are good at retrieving 
relevant documents) and poorly for `bad' initial queries (those that are bad at retrieving relevant 
documents). There are two possible solutions to this problem: either improve the initial ranking, so that 
there is a greater likelihood of relevant documents being used to modify the query, or improve the 
detection of relevant features, i.e. develop better RF techniques. 
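The dependence of pseudo RF on the quality of the initial ranking can be made concrete by measuring how many of the pseudo relevant documents are actually relevant. The sketch below assumes that relevance judgements are available for evaluation purposes (they are not available to the pseudo RF process itself); the function name and the example rankings are hypothetical.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    top = ranking[:k]
    if not top:
        return 0.0
    return sum(1 for d in top if d in relevant) / len(top)


# Hypothetical judged-relevant set and two initial rankings.
relevant = {"d1", "d2", "d3"}
good_ranking = ["d1", "d2", "d7", "d3"]  # a 'good' initial query
bad_ranking = ["d7", "d8", "d9", "d1"]   # a 'bad' initial query

good_p3 = precision_at_k(good_ranking, relevant, 3)
bad_p3 = precision_at_k(bad_ranking, relevant, 3)
```

With three pseudo relevant documents, the `good' ranking supplies two relevant documents to the feedback step, whereas the `bad' ranking supplies none, which is the situation in which query drift is likely.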
Mitra et al., [MSB98], have attempted, with some success, to rectify query drift by improving the 
precision at the top of the document ranking, increasing the likelihood of actual relevant material being 
contained within the set of pseudo relevant documents, and hence decreasing the likelihood of query 
drift. Their experiments used two approaches: a set of Boolean filters and term correlation information 
to prioritise retrieval of documents that cover all aspects of a query. They found that their approaches 
                                                           
23 As a replacement for the idf term weighting function, which is traditionally used when there is no relevance 
information. 