In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mi...
We present a divide-and-merge methodology for clustering a set of objects that combines a top-down "divide" phase with a bottom-up "merge" phase. In contrast, ...
David Cheng, Santosh Vempala, Ravi Kannan, Grant W...
Identification of distinct clusters of documents in text collections has traditionally been addressed by making the assumption that the data instances can only be represented by ...
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee G...
The Web is huge, unstructured and diverse in quality, which makes searching for information difficult. In practice, few of the documents returned by a search engine are valuable ...
Information resources on the Web, such as videos, images, and documents, are becoming increasingly “social” through user engagement via commenting systems. These commenting sy...