Virtualization provides the possibility of whole machine migration and thus enables a new form of fault tolerance that is completely transparent to applications and operating syst...
Previous ultra-dependable real-time computing architectures have been specialised to meet the requirements of a particular application domain. Over the last two years, a consortiu...
- In this paper, we propose an optimal fault tolerant broadcasting algorithm which requires only n+1 steps for an SIMD hypercube with up to n-1 faulty nodes. The basic idea of the ...
ing Abstraction to Improve Fault Tolerance MIGUEL CASTRO Microsoft Research and RODRIGO RODRIGUES and BARBARA LISKOV MIT Laboratory for Computer Science Software errors are a major...
Massively parallel computing systems are being built with thousands of nodes. Because of the high number of components, it is critical to keep these systems running even in the pre...