An increasing number of applications are being developed using distributed object computing (DOC) middleware, such as CORBA. Many of these applications require the underlying midd...
Aniruddha S. Gokhale, Balachandran Natarajan, Doug...
Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed system to recover from any transient fault that arbitrarily corrupts the contents of a...
This paper presents a method for analyzing the survivability of distributed network systems and an example of its application. Survivability is the capability of a system to fulfi...
Robert J. Ellison, Richard C. Linger, Thomas A. Lo...
Tolerating defects and fabrication variations will be critical in any system made with devices that have nanometer feature sizes. This paper considers how fabrication variations a...
Michael T. Niemier, Michael Crocker, Xiaobo Sharon...
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...