Sciweavers

HPDC
1999
IEEE
15 years 11 months ago
Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations
This paper reports on the architecture and design of Starfish, an environment for executing dynamic (and static) MPI-2 programs on a cluster of workstations. Starfish is unique in ...
Adnan Agbaria, Roy Friedman
ICPP
1996
IEEE
15 years 10 months ago
MpPVM: A Software System for Non-Dedicated Heterogeneous Computing
This paper presents the design and preliminary implementation of MpPVM, a software system that supports process migration for PVM application programs in a non-dedicated heterogen...
Kasidit Chanchio, Xian-He Sun
KDD
2009
ACM
16 years 7 months ago
Classification of software behaviors for failure detection: a discriminative pattern mining approach
Software is a ubiquitous component of our daily life. We often depend on the correct working of software systems. Due to the difficulty and complexity of software systems, bugs an...
David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo,...
ASPLOS
2012
ACM
14 years 2 months ago
Relyzer: exploiting application-level fault equivalence to analyze application resiliency to transient faults
Future microprocessors need low-cost solutions for reliable operation in the presence of failure-prone devices. A promising approach is to detect hardware faults by deploying low-...
Siva Kumar Sastry Hari, Sarita V. Adve, Helia Naei...
SC
2003
ACM
15 years 12 months ago
Dyn-MPI: Supporting MPI on Non Dedicated Clusters
Distributing data is a fundamental problem in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in environments where the participa...
D. Brent Weatherly, David K. Lowenthal, Mario Naka...