In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of ...
Empirical equations are an important class of regularities that can be discovered in databases. In this paper we concentrate on the role of equations as definitions of attribute val...
The ability to mine data represented as a graph has become important in several domains for detecting various structural patterns. One important area of data mining is anomaly det...
William Eberle, Lawrence B. Holder, Jeffrey Graves
Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe ...
Byron C. Wallace, Kevin Small, Carla E. Brodley, T...