Abstract. Regarding web searches, users have become used to keyword-based search interfaces due to their ease of use. However, this implies a semantic gap between the user's in...
Carlos Bobed, Raquel Trillo, Eduardo Mena, Sergio ...
Wikipedia provides a wealth of knowledge, where the first sentence, infobox (and relevant sentences), and even the entire document of a wiki article could be considered as diverse...
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT). In this paper, we present an er...
Kristen Parton, Kathleen McKeown, Bob Coyne, Mona ...
Developing better systems for document image analysis requires understanding errors, their sources, and their effects. The interactions between various processing steps are comple...
A degradation model that describes many image degradations produced by desktop scanning is used to study the edge noise that is present in bilevel document images. The standard de...
Craig McGillivary, Chris Hale, Elisa H. Barney Smi...