We show that excluding outliers from the training data significantly improves the kNN classifier, which in this case performs about 10% better than the best-known method--Centroid-based...
Background: This paper addresses the problem of discovering transcription factor binding sites in heterogeneous sequence data, which includes regulatory sequences of one or more g...
The number of features that can be computed over an image is, for practical purposes, limitless. Unfortunately, the number of features that can be computed and exploited by most c...
Bit arrays, or bitmaps, are used to significantly speed up set operations in several areas, such as data warehousing, information retrieval, and data mining. Howeve...
To address data imbalance in multilingual speech recognition and pronunciation variation across languages, we propose an ap...