Sciweavers

1024 search results - page 87 / 205
» Fault Tolerance in Decentralized Systems
IPPS
1998
IEEE
15 years 10 months ago
Fault-Tolerant Message Routing for Multiprocessors
In this paper the problem of fault-tolerant message routing in two-dimensional meshes, with each inner node having 4 neighbors, is investigated. It is assumed that some nodes/links...
Lev Zakrevski, Mark G. Karpovsky
ISCAPDCS
2001
15 years 7 months ago
Tolerating Transient Faults through an Instruction Reissue Mechanism
In this paper, we propose a fault-tolerant mechanism for microprocessors, which detects transient faults and recovers from them. There are two driving forces to investigate fault-t...
Toshinori Sato, Itsujiro Arita
SOSP
2009
ACM
16 years 3 months ago
UpRight cluster services
The UpRight library seeks to make Byzantine fault tolerance (BFT) a simple and viable alternative to crash fault tolerance for a range of cluster services. We demonstrate UpRight ...
Allen Clement, Manos Kapritsos, Sangmin Lee, Yang ...
ICS
2011
Tsinghua U.
14 years 9 months ago
High performance linpack benchmark: a fault tolerant implementation without checkpointing
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
Teresa Davies, Christer Karlsson, Hui Liu, Chong D...
CLUSTER
2004
IEEE
15 years 10 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H...