Reliability has become a serious concern as systems embrace nanometer technologies. In this paper, we propose a novel approach for organizing redundancy that provides high degree ...
Fault tolerant distributed protocols typically utilize a homogeneous fault model, either fail-crash or fail-Byzantine, where all processors are assumed to fail in the same manner....
The advent of large scale multi-hop wireless networks highlights problems of fault tolerance and scale in distributed system, motivating designs that autonomously recover from tra...
Abstract. This paper addresses the problems appearing in componentbased development of safety-critical systems. We aim at efficient reasoning about safety at system level while add...
Due to modern technology trends such as decreasing feature sizes and lower voltage levels, fault tolerance is becoming increasingly important in computing systems. Shared memory i...