Duplicate detection determines different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations t...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
This paper presents the second participation of the University of Ottawa group in the Cross-Language Speech Retrieval (CL-SR) task at CLEF 2006. We present the results of the submi...
We propose a generic framework and methods for simplification of large networks. The methods can be used to improve the understandability of a given network, to complemen...
How can we cull the facts we need from the overwhelming mass of information and misinformation that is the Web? The TextRunner extraction engine represents one approach, in which ...