ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Recognizing that information from different sources refers to the same (real world) entity is a crucial challenge in instance-level information integration, as it is a pre-requisi...
Paolo Bouquet, Heiko Stoermer, Claudia Nieder&eacu...
Sixearch.org is a peer application for social, distributed, adaptive Web search, which integrates the Sixearch.org protocol, a topical crawler, a document indexing system, a retri...
The World Wide Web lacks support for explaining information provenance. When web applications return results, many users do not know what information sources were used, when they ...
In image retrieval, most existing approaches that incorporate local features produce high dimensional vectors, which lead to a high computational and data storage cost. Moreover, ...