We present cluster-based publish/subscribe, a novel architecture that is not only resilient to event broker failures but also provides load balancing and fast event dissemination...
In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
Sensor relocation protocols can be employed as a fault tolerance approach to offset the coverage loss caused by node failures. We introduce a novel localized structure, in...
Many distributed systems may be limited in their performance by the number of transactions they are able to support per unit of time. In order to achieve fault tolerance and to...
Tal Anker, Danny Dolev, G. Greenman, I. Shnaiderma...
The Grid provides infrastructure that allows an arbitrary application to be executed on a range of different computational resources. When input files are very large, or when faul...