Bottom-up segmentation based only on low-level cues is a notoriously difficult problem. This difficulty has lead to recent top-down segmentation algorithms that are based on class-...
In this paper we introduce Ant-Q, a family of algorithms which present many similarities with Q-learning (Watkins, 1989), and which we apply to the solution of symmetric and asymm...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Identifying similar keywords, known as broad matches, is an important task in online advertising that has become a standard feature on all major keyword advertising platforms. Eff...
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...