Abstract-- We investigate the problem of clustering on distributed data streams. In particular, we consider the k-median clustering on stream data arriving at distributed sites whi...
The All Nearest Neighbor (ANN) operation is a commonly used primitive for analyzing large multi-dimensional datasets. Since computing ANN is very expensive, in previous works R*-t...
There are two major challenges for a high-performance remote-sensing database. First, it must provide low-latency retrieval of very large volumes of spatio-temporaldata. This requ...
Analyzing, visualizing, and illustrating changes within time-varying volumetric data is challenging due to the dynamic changes occurring between timesteps. The changes and variatio...
Classification of real-world data poses a number of challenging problems. Mismatch between classifier models and true data distributions on one hand and the use of approximate inf...