Sciweavers

» High Performance Data Mining
SIGMOD
2010
ACM
15 years 11 months ago
A comparison of join algorithms for log processing in MapReduce
The MapReduce framework is increasingly being used to analyze large volumes of data. One important type of data analysis done with MapReduce is log processing, in which a click-st...
Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Ju...
PRL
2010
15 years 5 months ago
Creating diverse nearest-neighbour ensembles using simultaneous metaheuristic feature selection
The nearest-neighbour (1NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good res...
Muhammad Atif Tahir, Jim E. Smith
ICPP
2009
IEEE
15 years 4 months ago
Heterogeneity-Aware Erasure Codes for Peer-to-Peer Storage Systems
Peer-to-peer (P2P) storage systems rely on data redundancy to obtain high levels of data availability. Among the existing data redundancy schemes, erasure coding is a widely adopte...
Lluis Pamies-Juarez, Pedro García Ló...
ICDE
2012
IEEE
13 years 9 months ago
Load Balancing in MapReduce Based on Scalable Cardinality Estimates
MapReduce has emerged as a popular tool for distributed and scalable processing of massive data sets and is increasingly being used in e-science applications. Unfortunately, the...
Benjamin Gufler, Nikolaus Augsten, Angelika Reiser...
SASO
2007
IEEE
16 years 1 month ago
e-SAFE: An Extensible, Secure and Fault Tolerant Storage System
With the rapidly falling price of hardware, and increasingly available bandwidth, the storage technology is seeing a paradigm shift from centralized and managed mode to distribute...
Sandip Agarwala, Arnab Paul, Umakishore Ramachandr...