Most clustering algorithms are partitional in nature, assigning each data point to exactly one cluster. However, several real-world datasets have inherently overlapping clusters i...
Imbalanced class problems appear in many real applications of classification learning. We propose a novel sampling method to improve bagging for data sets with skewed class distri...
In this paper, we revisit non-convex data factorization problems (regularized or not, unsupervised or supervised), where the optimization is confined to the nonnegative orthan...
In the last several years, large OLAP databases have become common in a variety of applications such as corporate data warehouses and scientific computing. To support interactive ...
With recent advances in sensory and mobile computing technology, enormous amounts of data about moving objects are being collected. One important application with such data is aut...
Xiaolei Li, Jiawei Han, Sangkyum Kim, Hector Gonza...