In this paper, we present an asynchronous consistent global checkpoint collection algorithm which prevents contention for network storage at the file server and hence reduces the...
Most application level fault tolerance schemes in literature are non-adaptive in the sense that the fault tolerance schemes incorporated in applications are usually designed witho...
Zizhong Chen, Ming Yang, Guillermo A. Francia III,...
This paper presents a new approach for the execution of coarse-grain (tiled) parallel SPMD code for applications derived from the explicit discretization of 2-dimensional PDE prob...
Georgios I. Goumas, Nikolaos Drosinos, Vasileios K...
In today’s large and complex network scenario vulnerability scanners play a major role from security perspective by proactively identifying the known security problems or vulner...
Many large-scale production applications often have very long executions times and require periodic data checkpoints in order to save the state of the computation for program rest...
Wei-keng Liao, Avery Ching, Kenin Coloma, Alok N. ...