We present the notion of Ranking for evaluation of two-class classifiers. Ranking is based on using the ordering information contained in the output of a scoring model, rather tha...
Many privacy preserving data mining algorithms attempt to selectively hide what database owners consider as sensitive. Specifically, in the association-rules domain, many of these ...
This paper proposes a clustering approach that explores both the content and the structure of XML documents for determining similarity among them. Assuming that the content and th...
Automatic recognition of named entities such as people, places, organizations, books, and movies across the entire web presents a number of challenges, both of scale and scope. Da...
Casey Whitelaw, Alexander Kehlenbeck, Nemanja Petr...
Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...