Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition and information retrieval for unsupervised dimensionality reduction. When label...
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Krieg...
Recent work has shown the feasibility and promise of templateindependent Web data extraction. However, existing approaches use decoupled strategies ? attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
We introduce a novel framework (BLOSOM) for mining (frequent) boolean expressions over binary-valued datasets. We organize the space of boolean expressions into four categories: p...
Lizhuang Zhao, Mohammed J. Zaki, Naren Ramakrishna...
Previous efforts on event detection from the web have focused primarily on web content and structure data ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
Classification has been commonly used in many data mining projects in the financial service industry. For instance, to predict collectability of accounts receivable, a binary clas...
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Privacy becomes a more and more serious concern in applications involving microdata. Recently, efficient anonymization has attracted much research work. Most of the previous metho...
Jian Xu, Wei Wang 0009, Jian Pei, Xiaoyuan Wang, B...
K-means is a widely used partitional clustering method. While there are considerable research efforts to characterize the key features of K-means clustering, further investigation...
1 A bridging rule in this paper has its antecedent and action from different conceptual clusters. We first design two algorithms for mining bridging rules between clusters in a dat...