We address the issues of discovering significant binary relationships in transaction datasets in a weighted setting. Traditional model of association rule mining is adapted to han...
Abstract. Individual privacy will be at risk if a published data set is not properly de-identified. k-anonymity is a major technique to de-identify a data set. A more general view ...
Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
In this paper we describe the design and implementation of a C++ based Common Component Architecture (CCA) framework, XCAT-C++. It can efficiently marshal and unmarshal large data...
Madhusudhan Govindaraju, Michael R. Head, Kenneth ...
We consider the problem of learning a record matching package (classifier) in an active learning setting. In active learning, the learning algorithm picks the set of examples to ...