Sciweavers

» Increasing the availability provided by RADIC with low overh...
HIPC
2007
Springer
16 years 7 days ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
CORR
2010
Springer
15 years 29 days ago
Precise, Scalable and Online Request Tracing for Multi-tier Services of Black Boxes
As more and more multi-tier services are developed from commercial off-the-shelf components or heterogeneous middleware without source code available, both developers and administr...
Bo Sang, Jianfeng Zhan, Zhihong Zhang, Lei Wang, D...
BROADNETS
2006
IEEE
16 years 4 days ago
Transparent Optimization of Grid Server Selection With Real-Time Passive Network Measurements
Grid services have tremendously simplified the programming challenges in leveraging large-scale distributed com... At the same time, the increased level of abstraction reduces the op...
Marcia Zangrilli, Bruce Lowekamp
FPL
2005
Springer
15 years 11 months ago
Defect-Tolerant FPGA Switch Block and Connection Block with Fine-Grain Redundancy for Yield Enhancement
Future process nodes have such small feature sizes that there will be an increase in the number of manufacturing defects per die. For large FPGAs, it will be critical to tolerate ...
Anthony J. Yu, Guy G. Lemieux
PPOPP
2005
ACM
15 years 11 months ago
Trust but verify: monitoring remotely executing programs for progress and correctness
The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of tho...
Shuo Yang, Ali Raza Butt, Y. Charlie Hu, Samuel P....