Decision tree induction algorithms scale well to large datasets for their univariate and divide-and-conquer approach. However, they may fail in discovering effective knowledge when...
Giovanni Giuffrida, Wesley W. Chu, Dominique M. Ha...
Background: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences....
Michael Beckstette, Robert Homann, Robert Giegeric...
Prototype selection problem consists of reducing the size of databases by removing samples that are considered noisy or not influential on nearest neighbour classification tasks. ...
Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points. In our model...
Konstantin Voevodski, Maria-Florina Balcan, Heiko ...
In this paper, we study a new type of spatial query, Nearest Surrounder (NS), which searches the nearest surrounding spatial objects around a query point. NS query can be more use...