3.4 Combination of evidence in RF 
Many of the RF and retrieval techniques described so far have utilised a single query representation 
compared against a series of single document representations, using one retrieval algorithm. Many 
researchers have argued that better retrieval effectiveness may be gained by exploiting multiple query 
representations, retrieval algorithms or feedback techniques and combining the results of a varied set of 
techniques or representations. The combination of evidence from multiple sources is the topic of this 
section. In particular, we will highlight approaches to multiple query representations (section 3.4.1), 
multiple retrieval algorithms (section 3.4.2) and multiple feedback algorithms (section 3.4.3). 
Before this, it is worth highlighting the two main arguments in favour of combination of evidence for IR 
and RF. Proponents of combining evidence usually base their motivation on either empirical findings 
or theoretical properties of evidence combination. The empirical evidence includes the fact that 
different retrieval functions or query representations will retrieve different documents, e.g. [Lee98]. A 
combination of query representations may increase the recall of a query, whereas the combination of 
retrieval functions may increase the precision of a search.  
A strong theoretical basis for combining evidence was provided by Ingwersen, [Ing94, Ing96]. His 
research argues that multiple representations of the same object, for example a query, can provide better 
insight into the object than a single good representation. What is important, however, is that the multiple 
sources of evidence each provide not only a different viewpoint on the object, but viewpoints that 
have different cognitive bases. Here, more evidence alone is not better; what is 
important is the variety of evidence. This intentional redundancy (multiple descriptions of the same 
object) can help uncover information about the user. Multiple query representations, for example, can 
provide different interpretations of a user's underlying information need, or provide more detail about 
how the user is making relevance assessments. 
3.4.1 Multiple query representations 
Belkin et al., [BKF+95], differentiated between two types of retrieval combination based on multiple 
representations of a query: 
i. query combination. In this case the scores for a document are computed directly from 
query-document scores, using the same retrieval engine but different versions of the query. 
ii. data fusion. If different retrieval systems are used to compute query-document similarity 
scores then the scores may not be compatible for combining. For instance, the scores may be 
in different ranges, or it may not be possible to normalise them to give comparable rankings. In this 
case it is necessary to combine evidence from the document rankings rather than from the 
query-document similarity scores. This form of evidence combination is known as data fusion. 
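The distinction can be illustrated with a minimal sketch. It is not the procedure used in [BKF+95]: summing scores and Borda-style rank points are assumed here only as simple stand-ins for the two kinds of combination, and the function names and document identifiers are hypothetical.

```python
from collections import defaultdict


def query_combination(score_lists):
    """Combine scores from several versions of one query run on the SAME engine.

    score_lists: list of {doc_id: score} dicts, one per query version.
    Scores are assumed comparable, so they are simply summed (an assumption).
    Returns doc ids ordered by combined score, ties broken by doc id.
    """
    combined = defaultdict(float)
    for scores in score_lists:
        for doc_id, score in scores.items():
            combined[doc_id] += score
    return sorted(combined, key=lambda d: (-combined[d], d))


def data_fusion(rankings):
    """Combine rankings from DIFFERENT systems without using their raw scores.

    rankings: list of ranked doc-id lists, best document first.
    Each ranking awards (length - position) points to a document, a
    Borda-style rule assumed here for illustration.
    Returns doc ids ordered by total points, ties broken by doc id.
    """
    points = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for position, doc_id in enumerate(ranking):
            points[doc_id] += n - position
    return sorted(points, key=lambda d: (-points[d], d))


# Two versions of a query on one engine: scores can be added directly.
by_score = query_combination([{"d1": 0.9, "d2": 0.4}, {"d2": 0.8, "d3": 0.5}])

# Three different systems: only the rank positions are used.
by_rank = data_fusion([["d1", "d2", "d3"], ["d3", "d2", "d1"], ["d2", "d1", "d3"]])
```

In the first function the raw scores carry the evidence, so it is only meaningful when one engine produced them all; the second discards scores entirely, which is what makes it usable across incompatible systems.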
Belkin et al. experimented with both kinds of combination, showing that data fusion generally performed 
less well than query combination approaches. The general trend of the experiments presented in 
[BKF+95] was that combination of query representations can improve retrieval effectiveness but that it is 
difficult to determine which sources of evidence are good ones to combine. Ruthven et al., [RLVR02a], 
showed similar results for retrieval using a variety of term weighting schemes. Both these experiments 
only looked at initial retrieval, with no RF. 
Ruthven et al.'s experiments were extended in [RLVR02a] to the RF case, where they showed that 
relevance information (the relevant documents) could be used to select which weighting schemes should 
be used to weight query terms. That is, it is possible to select, for each query term, how that term 
should be used to score documents: namely, which weighting schemes are best at indicating relevance for 
that query term. The results from this technique were generally better than the best combination of 
weighting schemes for the collections tested. This shows that selecting evidence for combination, 
through relevance information, can lead to successful combination of evidence. 
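The idea of selecting evidence per query term can be sketched as follows. The selection rule is an assumption made for illustration, not the criterion reported in [RLVR02a]: a weighting scheme is kept for a term when the term's average weight under that scheme is higher in the relevant documents than in the remaining documents. The function name, scheme names and weights are hypothetical.

```python
def mean(values):
    """Arithmetic mean of an iterable; 0.0 for an empty iterable."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def select_schemes(term_weights, relevant, all_docs):
    """Pick the weighting schemes that appear to indicate relevance for ONE term.

    term_weights: {scheme_name: {doc_id: weight of the term in that document}}
    relevant:     doc ids the user has marked relevant
    all_docs:     every doc id under consideration
    Assumed rule: keep a scheme if its mean weight over the relevant
    documents exceeds its mean weight over the other documents.
    """
    relevant = set(relevant)
    non_relevant = [d for d in all_docs if d not in relevant]
    selected = []
    for scheme, weights in term_weights.items():
        rel_mean = mean(weights.get(d, 0.0) for d in relevant)
        non_mean = mean(weights.get(d, 0.0) for d in non_relevant)
        if rel_mean > non_mean:
            selected.append(scheme)
    return selected


# Hypothetical weights for a single query term under two schemes.
weights_for_term = {
    "tf": {"d1": 0.8, "d2": 0.6, "d3": 0.1},   # varies with the document
    "idf": {"d1": 0.3, "d2": 0.3, "d3": 0.3},  # constant for a given term
}
chosen = select_schemes(weights_for_term, relevant=["d1", "d2"], all_docs=["d1", "d2", "d3"])
```

Under this rule a scheme whose value does not differ between relevant and other documents (here the constant "idf" weight) is not selected for the term, while a scheme that separates the two groups is.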
Haines and Croft, [HC93], described RF in an alternative probabilistic model, the inference network. 
Inference networks are composed of nodes (representing documents, terms, phrases, etc.) and arcs 
representing the dependencies between the nodes. An example is given in Figure 15. The top nodes, 
labelled d, represent the documents in the collection. The nodes labelled r are concept recognition 
nodes; these nodes represent the content of the document. The nodes labelled q are query nodes, 
representing elements of the query. The bottom node, labelled I, is the `information need' node. This 