Several important time series data mining problems reduce to the core task of finding approximately repeated subsequences in a longer time series. In an earlier work, we formalize...
Bill Yuan-chi Chiu, Eamonn J. Keogh, Stefano Lonar...
Our aim is to develop new database technologies for the approximate matching of unstructured string data using indexes. We explore the potential of the suffix tree data structure i...
Conditional Random Sampling (CRS) was originally proposed for efficiently computing pairwise (l2, l1) distances, in static, large-scale, and sparse data. This study modifies the o...
The 2-class transduction problem, as formulated by Vapnik [1], involves finding a separating hyperplane for a labelled data set that is also maximally distant from a given set of...
In this paper, we study probabilistic modeling of heterogeneously attributed multi-dimensional arrays. The model can manage the heterogeneity by employing an individual exponential...