Sciweavers

17390 search results - page 41 / 3478
» Distributed Data Clustering
IDA
2009
Springer
15 years 3 months ago
Context-Based Distance Learning for Categorical Data Clustering
Clustering data described by categorical attributes is a challenging task in data mining applications. Unlike numerical attributes, it is difficult to define a distance b...
Dino Ienco, Ruggero G. Pensa, Rosa Meo
OSDI
2004
ACM
16 years 6 months ago
MapReduce: Simplified Data Processing on Large Clusters
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
Jeffrey Dean, Sanjay Ghemawat
JMLR
2010
15 years 22 days ago
Hierarchical Convex NMF for Clustering Massive Data
We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large scale variant of convex non-negative matrix factorization ...
Kristian Kersting, Mirwaes Wahabzada, Christian Th...
ICDE
2006
IEEE
16 years 8 hours ago
Privacy Preserving Clustering on Horizontally Partitioned Data
Data mining has been a popular research area for more than a decade due to its vast spectrum of applications. The power of data mining tools to extract hidden information that can...
Ali Inan, Yücel Saygin, Erkay Savas, Ayç...
IPPS
2007
IEEE
16 years 8 days ago
Towards A Better Understanding of Workload Dynamics on Data-Intensive Clusters and Grids
This paper presents a comprehensive statistical analysis of workloads collected on data-intensive clusters and Grids. The analysis is conducted at different levels, including Virt...
Hui Li, Lex Wolters