Duplicate detection determines different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations t...
Web page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications. Active learning seeks to select the m...
In history and the other humanities, events and narrative sequences of events are often of primary interest. Yet while named events sometimes appear as subject headings, systems f...
Social bookmarking is the process through which users share tags for online resources like blogs with others. Such collaborative tags provide valuable metadata for retrieval syste...