Sciweavers

IPPS
2007
IEEE
16 years 22 days ago
DejaVu: Transparent User-Level Checkpointing, Migration, and Recovery for Distributed Systems
In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
ICPP
2007
IEEE
16 years 23 days ago
Energy-Efficient Scheduling for Parallel Applications Running on Heterogeneous Clusters
High performance clusters have been widely used to provide amazing computing capability for both commercial and scientific applications. However, huge power consumption has preven...
Ziliang Zong, Xiao Qin, Xiaojun Ruan, Kiranmai Bel...
ICPADS
2007
IEEE
16 years 23 days ago
Persistence and communication state transfer in an asynchronous pipe mechanism
Emergent wide-area distributed systems like computational grids present opportunities for large scientific applications. On these systems, communication mechanisms hav...
Philip Chan, David Abramson
IPPS
2010
IEEE
15 years 4 months ago
Tile QR factorization with parallel panel processing for multicore architectures
To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of...
Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack D...
PVM
2007
Springer
16 years 17 days ago
Revealing the Performance of MPI RMA Implementations
The MPI remote-memory access (RMA) operations provide a different programming model from the regular MPI-1 point-to-point operations. This model is particularly appropriate for ca...
William D. Gropp, Rajeev Thakur