Redescription mining is a newly introduced data mining problem that seeks to find subsets of data that afford multiple definitions. It can be viewed as a generalization of associa...
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...
Abstract. The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are...
We investigate the impact of time on the predictability of sentiment classification research for models created from web logs. We show that sentiment classifiers are time dependen...
A string similarity join finds similar pairs between two collections of strings. It is an essential operation in many applications, such as data integration and cleaning, and has ...