[Figure 13 plot: average precision (y-axis, 0 to 18) against feedback iterations (x-axis, 0 to 4), with four series: Freezing, Test/control, Residual (removal), Residual (no removal).]
Figure 13: Average precision over 4 iterations of feedback 
2.5 Summary of RF 
In this section we summarise the main points from the previous sections and outline some of the 
major issues in the core RF models. Section 2.5.1 summarises the comparison between 
Boolean and best match models, section 2.5.2 compares the types of best match model, and 
section 2.5.3 compares the two main components of RF: query term reweighting and query 
expansion. 
2.5.1 Boolean vs Best match 
Although Boolean models are still popular and have strong advocates, e.g. [FST+99], best match 
models generally have several advantages over exact match models. The first advantage is that the 
user does not need to construct a formal query expression, as the Boolean model requires. Instead, 
they can enter a natural language expression. This means that users can initiate retrieval sessions 
without knowledge of the collection, previous searching experience or experience in creating Boolean 
queries. 
A second advantage is that ranking documents allows users to interact with the system in a more 
meaningful fashion [Beau97]: documents are presented in order of match, and documents are not 
excluded if they miss out elements of the query. 
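The contrast between exact match and best match retrieval can be illustrated with a minimal sketch. The scoring used here (a count of matching query terms) is an assumption chosen for brevity and is not one of the best match models discussed in this paper; the function names and the toy documents are likewise hypothetical.

```python
def boolean_and(query_terms, docs):
    """Exact match: keep only documents that contain every query term."""
    q = set(query_terms)
    return [doc_id for doc_id, text in docs.items() if q <= set(text.split())]


def best_match(query_terms, docs):
    """Best match: rank documents by the number of query terms they contain.

    Documents that miss some query terms are still returned, lower in the ranking.
    """
    q = set(query_terms)
    scores = {doc_id: len(q & set(text.split())) for doc_id, text in docs.items()}
    # Sort by descending score, breaking ties on document id.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [(d, scores[d]) for d in ranked if scores[d] > 0]


# Toy collection (illustrative only).
docs = {
    "d1": "relevance feedback improves query ranking",
    "d2": "boolean query expression",
    "d3": "feedback on ranking",
}
query = ["relevance", "feedback", "ranking"]

print(boolean_and(query, docs))
print(best_match(query, docs))
```

In this toy example the conjunctive Boolean query returns only the document containing all three terms, whereas the ranked list also retains the partially matching document.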
Thirdly, the system can automatically alter a query through RF. The main strength of best match models 
is that they allow for iterative improvement, often using similar techniques to retrieve documents as to 
modify queries. The strength of ranking models for RF is that, after the initial query, the user can 
interact without further describing the information for which they are searching. The RF algorithms 
discussed in the main body of this paper deal almost exclusively with best match models. In the next 
section we shall look at the relative performance of the best match models discussed previously. 
2.5.2 Relative performance of best match models 
In [SB90] Salton and Buckley investigated the relative performance of 12 feedback algorithms on six 
standard test collections (footnote 18). Several of the feedback algorithms (Ide dec-hi, F4, Rocchio, and 
three versions of Rocchio with scaling factors for the query, relevant and non-relevant sets) have 
already been discussed. 
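As a reminder of the vector-modification schemes named above, the following is a minimal sketch of the Rocchio and Ide dec-hi updates as they are commonly stated in the literature; the notation used earlier in this paper may differ. The parameter names alpha, beta and gamma, their default values, and the example vectors are illustrative assumptions, and negative term weights are left as they are rather than being clipped.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=1.0, gamma=1.0):
    """Rocchio update: Q' = alpha*Q + beta*mean(relevant) - gamma*mean(non_relevant)."""
    def mean(vectors):
        # An empty set contributes a zero vector.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_mean = mean(relevant)
    non_mean = mean(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_mean, non_mean)]


def ide_dec_hi(query, relevant, non_relevant_ranked):
    """Ide dec-hi update: Q' = Q + sum(relevant) - top-ranked non-relevant document.

    `non_relevant_ranked` is assumed to be ordered by retrieval rank.
    """
    new_query = list(query)
    for doc in relevant:
        new_query = [q + d for q, d in zip(new_query, doc)]
    if non_relevant_ranked:
        new_query = [q - d for q, d in zip(new_query, non_relevant_ranked[0])]
    return new_query


# Illustrative term-weight vectors over a three-term vocabulary.
query = [1.0, 0.0, 0.0]
relevant = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
non_relevant = [[0.0, 0.0, 2.0]]

print(rocchio(query, relevant, non_relevant, alpha=1.0, beta=1.0, gamma=0.5))
print(ide_dec_hi(query, relevant, non_relevant))
```

The sketch makes the structural difference visible: Rocchio normalises each document set by its size and scales it, while Ide dec-hi adds the relevant documents unnormalised and subtracts only a single non-relevant document.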
A further version of the Ide scheme was used, the Ide regular scheme [IdS71], which uses all 
retrieved non-relevant documents. The Ide regular scheme is based on the Rocchio formula but omits the 
                                                           
18. CACM, CISI, Cranfield, Inspec, MEDLARS and NPL collections. These are relatively short document 
collections ranging from 1,033 documents (MEDLARS) to 12,684 documents (Inspec). 