Abstract. This paper proposes a novel scheme, named ER-TCP, which transparently masks failures that occur on the server nodes in a cluster from clients at TCP connection granular...
Abstract. In order to construct and deploy massively multiagent systems, we must address one of the fundamental issues of distributed systems, the possibility of partial failures. ...
Dynamic fault-tolerance management (DFTM) was previously introduced as a means of providing environment- and workload-driven adaptation for failure-prone battery-powered systems. Th...
A novel framework shows the potential of FPGA-based systems for increasing fault-tolerance and flexibility by placing functionality onto free hardware (HW) or software (SW) resour...
Database systems are a key component behind many of today's computer systems. As a consequence, it is crucial that database systems provide correct and continuous service des...