Protecting data privacy is an important problem in microdata distribution. Anonymization algorithms typically aim to protect individual privacy, with minimal impact on the quality...
Kristen LeFevre, David J. DeWitt, Raghu Ramakrishn...
In this paper, we investigate how deviation in evaluation activities may reveal bias on the part of reviewers and controversy on the part of evaluated objects. We focus on a `data...
We study the problem of private classification using kernel methods. More specifically, we propose private protocols implementing the Kernel Adatron and Kernel Perceptron learning ...
We introduce a new EM framework in which it is possible not only to optimize the model parameters but also the number of model components. A key feature of our approach is that we...
In this paper, we consider the problem of identifying and segmenting topically cohesive regions in the URL tree of a large website. Each page of the website is assumed to have a t...
In this paper, we consider the evolution of structure within large online social networks. We present a series of measurements of two such networks, together comprising in excess ...
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity be...
Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
In this paper we present a new approach to mining binary data. We treat each binary feature (item) as a means of distinguishing two sets of examples. Our interest is in selecting ...