We study the mining of interesting patterns in the presence of numerical attributes. Instead of the usual discretization methods, we propose the use of rank based measures to scor...
Collaborative recommender systems are highly vulnerable to attack. Attackers can use automated means to inject a large number of biased profiles into such a system, resulting in r...
Robin D. Burke, Bamshad Mobasher, Chad Williams, R...
In this work we focus on the problem of frequent itemset mining on large, out-of-core data sets. After presenting a characterization of existing out-of-core frequent itemset minin...
Often the best performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classif...
Cristian Bucila, Rich Caruana, Alexandru Niculescu...
The output of a data mining algorithm is only as good as its inputs, and individuals are often unwilling to provide accurate data about sensitive topics such as medical history an...
This paper describes a novel classification method for computer aided detection (CAD) that identifies structures of interest from medical images. CAD problems are challenging larg...
The goal of entity resolution is to reconcile database references corresponding to the same real-world entities. Given the abundance of publicly available databases where entities...
Indrajit Bhattacharya, Lise Getoor, Louis Licamele
Finding patterns of social interaction within a population has wide-ranging applications including: disease modeling, cultural and information transmission, and behavioral ecology...
The top web search result is crucial for user satisfaction with the web search experience. We argue that the importance of the relevance at the top position necessitates special h...
Privacy preserving data processing has become an important topic recently because of advances in hardware technology which have lead to widespread proliferation of demographic and...