We investigate to what extent people making relevance judgements for a reusable IR test collection are exchangeable. We consider three classes of judge: "gold standard" ...
Peter Bailey, Nick Craswell, Ian Soboroff, Paul Th...
Negative relevance feedback is a special case of relevance feedback where we do not have any positive example; this often happens when the topic is difficult and the search result...
We propose a mathematical framework for query selection as a mechanism for reducing the cost of constructing information retrieval test collections. In particular, our mathematica...
Mehdi Hosseini, Ingemar J. Cox, Natasa Milic-Frayl...
This paper describes a method, using Genetic Programming, to automatically determine term weighting schemes for the vector space model. Based on a set of queries and their human de...
This paper examines whether the Cranfield evaluation methodology is robust to gross violations of the completeness assumption (i.e., the assumption that all relevant documents wi...