Histograms are used to summarize the contents of relations into a number of buckets for the estimation of query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have ...
Francesco Buccafurri, Gianluca Lax, Domenico SaccÃ...
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number...
Given two sets of points P and Q, a group nearest neighbor (GNN) query retrieves the point(s) of P with the smallest sum of distances to all points in Q. Consider, for instance, t...
We consider the problems of computing aggregation queries in temporal databases, and of maintaining materialized temporal aggregate views efficiently. The latter problem is partic...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...