Fault tolerance is a very important concern for critical high-performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
The tuple space coordination model is one of the most interesting coordination models for open distributed systems due to its space and time decoupling and its synchronization pow...
In this paper we address the problem of decentralised coordination for agents that must make coordinated decisions over continuously valued control parameters (as is required in m...
Ruben Stranders, Alessandro Farinelli, Alex Rogers...
Large clusters, high availability clusters and Grid deployments often suffer from network, node or operating system faults and thus require the use of fault tolerant programmin...
This paper considers the problem of performing decentralised coordination of low-power embedded devices (as is required within many environmental sensing and surveillance applicat...
Alessandro Farinelli, Alex Rogers, Adrian Petcu, N...