Recent work in deduplication has shown that collective deduplication of different attribute types can improve performance. But although these techniques cluster the attributes col...
In this paper we describe the system we developed for taking part in monolingual Spanish and English tasks at ResPubliQA 2009. Our system was composed of an IR phase focused on im...
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has ...
Strategic business decision making involves the analysis of market forecasts. Today, the identification and aggregation of relevant market statements is done by human experts, oft...
Henning Wachsmuth, Peter Prettenhofer, Benno Stein