Sciweavers

2498 search results for "Software Fault Tolerance" (page 370 of 500)
SOSP
2007
ACM
16 years 3 months ago
Sinfonia: a new paradigm for building scalable distributed systems
We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distri...
Marcos Kawazoe Aguilera, Arif Merchant, Mehul A. S...
VEE
2010
ACM
16 years 1 months ago
Multi-stage replay with crosscut
Deterministic record-replay has many useful applications, ranging from fault tolerance and forensics to reproducing and diagnosing bugs. When choosing a record-replay solution, th...
Jim Chow, Dominic G. Lucchetti, Tal Garfinkel, Geo...
SC
2009
ACM
16 years 1 months ago
Supporting fault-tolerance for time-critical events in distributed environments
In this paper, we consider the problem of supporting fault tolerance for adaptive and time-critical applications in heterogeneous and unreliable grid computing environments. Our g...
Qian Zhu, Gagan Agrawal
SC
2009
ACM
16 years 1 months ago
Flexible cache error protection using an ECC FIFO
We present ECC FIFO, a mechanism enabling two-tiered last-level cache error protection using an arbitrarily strong tier-2 code without increasing on-chip storage. Instead of addin...
Doe Hyun Yoon, Mattan Erez
GLVLSI
2009
IEEE
16 years 1 months ago
Online circuit reliability monitoring
In this work we propose an online reliability tracking framework that utilizes a hybrid network of on-chip temperature and delay sensors together with a circuit reliability macrom...
Bin Zhang