In this paper we present the design of a modern course in cluster computing and large-scale data processing. The defining differences between this and previously published designs...
Aaron Kimball, Sierra Michels-Slettvet, Christophe...
Distributed data mining has recently caught a lot of attention as there are many cases where pooling distributed data for mining is prohibited, due to either huge data volume o...
Chak-Man Lam, Xiaofeng Zhang, William Kwok-Wai Che...
Motivated by the poor performance (linear complexity) of the EM algorithm in clustering large data sets, and inspired by the successful accelerated versions of related algorithms l...
This paper gives an overview of two middleware systems that have been developed over the last six years to address the challenges involved in developing parallel and distributed imp...
We present a novel algorithm called CLICKS that finds clusters in categorical datasets based on a search for k-partite maximal cliques. Unlike previous methods, CLICKS mines subs...