Abstract: Shared computing utilities allocate compute, network, and storage resources to competing applications on demand. An awareness of the demands and behaviors of the hosted...
The DepAuDE architecture provides middleware to integrate fault tolerance support into distributed embedded automation applications. It allows error recovery to be expressed in te...
Geert Deconinck, Vincenzo De Florio, Ronnie Belman...
Abstract: We present a new approach that uses compiler-directed fault-injection for coverage testing of recovery code in Internet services to evaluate their robustness to operating ...
Chen Fu, Richard P. Martin, Kiran Nagaraja, Thu D....
Our Distributed Telecommunication Management System (DTMS) uses an object-oriented model to describe the networked Voice Communication System (VCS) to be managed. In order to allo...
Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today's large, distributed, and dynamic application e...
Mike Y. Chen, Emre Kiciman, Eugene Fratkin, Armand...