Algorithms in distributed information retrieval often rely on accurate knowledge of the size of a collection. The "multiple capture-recapture" method of Shokouhi et al. ...
We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resultin...
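As a rough illustration of the quantity named above, the following minimal sketch computes the relative entropy (KL divergence) between a query language model and a collection language model. The excerpt does not say how the query model is estimated, so this sketch assumes plain maximum-likelihood unigram models built from token lists. The function names `language_model` and `relative_entropy` and the toy data are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def language_model(tokens):
    """Maximum-likelihood unigram model: P(w) = count(w) / total tokens.

    Assumption: the paper's actual estimation method is not shown in the excerpt.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relative_entropy(query_model, collection_model):
    """Return sum_w P(w|Q) * log2(P(w|Q) / P(w|C)), measured in bits."""
    score = 0.0
    for word, p_query in query_model.items():
        if p_query == 0.0:
            continue  # zero-probability terms contribute nothing
        p_collection = collection_model.get(word, 0.0)
        if p_collection == 0.0:
            # The divergence is undefined if the collection model lacks the word.
            raise ValueError(f"word {word!r} has zero collection probability")
        score += p_query * math.log2(p_query / p_collection)
    return score


# Toy data (hypothetical): a uniform collection and a query model skewed toward "a".
collection = language_model(["a", "a", "b", "b", "c", "c", "d", "d"])
query = language_model(["a", "a", "a", "b"])
score = relative_entropy(query, collection)
```

A query model identical to the collection model scores zero under this definition, and the score grows as the query model concentrates on words that are comparatively rare in the collection.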
CL Research participated in the question answering and novelty tracks in TREC 2004. The Knowledge Management System (KMS), which provides a single interface for question answering...
Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by explo...
We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...