The complexity of testing properties of monotone and unimodal distributions, when given access only to samples of the distribution, is investigated. Two kinds of sublineartime alg...
Estimating the selectivity of multidimensional range queries over real valued attributes has significant applications in data exploration and database query optimization. In this p...
Dimitrios Gunopulos, George Kollios, Vassilis J. T...
Random sampling is an appealing approach to build synopses of large data streams because random samples can be used for a broad spectrum of analytical tasks. Users are often inter...
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...
Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...
Suppose we have a large table T of items i, each with a weight wi, e.g., people and their salary. In a general preprocessing step for estimating arbitrary subset sums, we assign e...
Noga Alon, Nick G. Duffield, Carsten Lund, Mikkel ...