This paper presents a study of a distributed cooperation problem under the assumption that processors may not be able to communicate for a prolonged time. The problem for n proces...
Grzegorz Malewicz, Alexander Russell, Alexander A....
In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through an algorithm that can be implemented in hard...
In this paper, we consider the problem of supporting fault tolerance for adaptive and time-critical applications in heterogeneous and unreliable grid computing environments. Our g...
Mixed-parallelism, the combination of data- and task-parallelism, is a powerful way of increasing the scalability of entire classes of parallel applications on platforms c...
In this paper, we provide an overview of the Logistical Runtime System (LoRS). LoRS is an integrated ensemble of tools and services that aggregate primitive (best-effort, faulty) stor...
James S. Plank, Micah Beck, Jack Dongarra, Richard...