We present a novel approach to embedding data represented by a network into a lowdimensional Euclidean space. Unlike existing methods, the proposed method attempts to minimize an ...
Previous discretization techniques have discretized numeric attributes into disjoint intervals. We argue that this is neither necessary nor appropriate for naive-Bayes classifiers...
This work is concerned with the estimation of a classifier's accuracy. We first review some existing methods for error estimation, focusing on cross-validation and bootstrap,...
We combine the power law of learning and theoretical upper limit predictions to describe the development of text entry rates from users' first contact to asymptotic expert us...
The usual data mining setting uses the full amount of data to derive patterns for different purposes. Taking cues from machine learning techniques, we explore ways to divide the d...