Real datasets are often large enough to necessitate data compression. Traditional `syntactic' data compression methods treat the table as a large byte string and operate at t...
H. V. Jagadish, Raymond T. Ng, Beng Chin Ooi, Anth...
Tree edit distance is one of the most frequently used distance measures for comparing trees. When using the tree edit distance, we need to determine the cost of each operation, bu...
A spatial outlier is a spatially referenced object whose non-spatial attribute values are significantly different from the values of its neighborhood. Identification of spatial ...
Consensus clustering is the problem of reconciling clustering information about the same data set coming from different sources or from different runs of the same algorithm. Cast ...
One of the primary methods employed by researchers to judge the merits of new heuristics and algorithms is to run them on accepted benchmark test cases and comparing their perform...