Data mining is increasingly performed by people who are not computer scientists or professional programmers. It is often done as an iterative process involving multiple ad-hoc tas...
The rise of social network sites in recent years is a trend that seems likely to continue. What naive technology users may not realize is that the information they provide...
This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system shoul...
Re-identification is a major privacy threat to public datasets containing individual records. Many privacy protection algorithms rely on generalization and suppression of "qu...
There are many situations in which we have more than one view of a single data source, or in which we have multiple sources of data that are aligned. We would like to be able to bu...