Pattern discovery in sequences is an important problem in many applications, especially in computational biology and text mining. However, due to the noisy nature of data, the tra...
Starting from an axiomatization of a generalization of Shannon entropy we introduce a set of axioms for a parametric family of distances over sets of partitions of finite sets. T...
Mining data streams is important in both science and commerce. Two major challenges are (1) the data may grow without limit so that it is difficult to retain a long history; and (...
Propositionalization has already been shown to be a particularly promising approach for robustly and effectively handling relational data sets for knowledge discovery. In this pap...
Most data mining operations include an integral search component at their core. For example, the performance of similarity search or classification based on Nearest Neighbors is ...