Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization err...
This work addresses a novel spatial keyword query called the m-closest keywords (mCK) query. Given a database of spatial objects, each tuple is associated with some descriptive inf...
Abstract breaches. To do so, the data custodian needs to transform its data. To determine the appropriate transforPrivacy preserving data mining so far has mainly mation, there are...
Shaofeng Bu, Laks V. S. Lakshmanan, Raymond T. Ng,...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering&...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
We consider the problem of selecting a subset of m most informative features where m is the number of required features. This feature selection problem is essentially a combinator...
Zenglin Xu, Rong Jin, Jieping Ye, Michael R. Lyu, ...