Distributed storage systems often use data replication to mask failures and guarantee high data availability. Node failures can be transient or permanent. While the system must ge...
Jing Tian, Zhi Yang, Wei Chen, Ben Y. Zhao, Yafei ...
We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...
We consider the design of online master algorithms for combining the predictions from a set of experts where the absolute loss of the master is to be close to the absolut...
Jacob Abernethy, John Langford, Manfred K. Warmuth
In their pioneering paper [4], Gallager et al. introduced a distributed algorithm for constructing the minimum-weight spanning tree (MST). Many authors have suggested ways to enhan...
We propose a novel hierarchical clustering algorithm for data sets in which only pairwise distances between the points are provided. The classical Hungarian method is an efficient...