Implementations of map-reduce are being used to perform many operations on very large data. We examine strategies for joining several relations in the map-reduce environment. Our ...
When classifying high-dimensional sequence data, traditional methods (e.g., HMMs, CRFs) may require large amounts of training data to avoid overfitting. In such cases dimensional...
We study the performance of “iterative” record linkage (RL), in which match and merge operations may occur together, repeating in iterations until convergence. We ...
This paper addresses lossy transmission of a common source over a broadcast channel when there is correlated side information at the receivers. The quadratic Gaussian and binary H...
The benefits of virtualized IT environments, such as compute clouds, have drawn interested enterprises to migrate their applications onto new platforms to gain the advantages of ...
Xiaoning Ding, Hai Huang, Yaoping Ruan, Anees Shai...