A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
The computation of covariance and correlation matrices are critical to many data mining applications and processes. Unfortunately the classical covariance and correlation matrices...
James Chilson, Raymond T. Ng, Alan Wagner, Ruben H...
Machine learning and data mining can be effectively used to model, classify and discover interesting information for a wide variety of data including email. The Email Mining Toolk...
In constrained clustering it is common to model the pairwise constraints as edges on the graph of observations. Using results from graph theory, we analyze such constraint graphs ...
A new method for ensemble generation is presented. It is based on grouping the attributes in dierent subgroups, and to apply, for each group, an axis rotation, using Principal Com...