Abstract. Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common su...
Hao Zhu, David Goodell, William Gropp, Rajeev Thak...
Overlapping computation with communication is a key technique to conceal the effect of communication latency on the performance of parallel applications. MPI is a widely used mess...
The emergence of heterogeneous many core architectures presents a unique opportunity for delivering order of magnitude performance increases to high performance applications by ma...
We present a load-balancing technique that exploits the temporal coherence, among successive computation phases, in mesh-like computations to be mapped on a cluster of processors....
Biagio Cosenza, Gennaro Cordasco, Rosario De Chiar...
Code model checking of software components suffers from the well-known problem of state explosion when applied to highly parallel components, despite the fact that a single compon...