We consider the problem of speeding up Entity Recognition systems that exploit existing large databases of structured entities to improve extraction accuracy. These systems requir...
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting corre...
Decision tree construction is a well studied problem in data mining. Recently, there has been much interest in mining streaming data. Domingos and Hulten have presented a one-pass...
We study the problem of finding an outlier-free subset of a set of points (or a probability distribution) in n-dimensional Euclidean space. As in [BFKV 99], a point x is defined t...
We consider the problem of maintaining approximate counts and quantiles over fixed- and variablesize sliding windows in limited space. For quantiles, we present deterministic algo...