Abstract— Error control is a major concern in many computer systems, particularly those deployed in critical applications. Experience shows that most malfunctions during system o...
Christopher G. Knight, Adit D. Singh, Victor P. Ne...
This paper describes a powerful, scalable, reconfigurable computer called the PARTS engine. The PARTS engine consists of 16 Xilinx 4025 FPGAs, and 16 one-megabyte SRAMs. The FPGAs...
: The distributed recovery block (DRB) scheme is a widely applicable approach for realizing both hardware and software fault tolerance in real-time distributed and parallel compute...
The Reusable Software Fault Tolerance Testbed ReSoFT was developed to facilitate the development and evaluation of high-assurance systems that require tolerance of both hardware...
Kam S. Tso, Eltefaat Shokri, Roger J. Dziegiel Jr.
Practical clustering algorithms require multiple data scans to achieve convergence. For large databases, these scans become prohibitively expensive. We present a scalable clusteri...