To improve software productivity, when constructing new software systems, developers often reuse existing class libraries or frameworks by invoking their APIs. Those APIs, however...
We propose and test an objective criterion for evaluation of clustering performance: How well does a clustering algorithm run on unlabeled data aid a classification algorithm? The...
We give a permutation approach to validation (estimation of out-sample error). One typical use of validation is model selection. We establish the legitimacy of the proposed permut...
Recently a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and ide...
Jennifer Neville, Brian Gallagher, Tina Eliassi-Ra...
Data mining is a useful decision support technique that can be used to discover production rules in warehouses or corporate data. Data mining research has made much effort to appl...