Most image processing algorithms can be parallelized by splitting parallel loops and by using very few communication patterns. Code parallelization using MPI still involves much p...
The dual-cube is a newly proposed topology for interconnection networks, which uses low dimensional hypercubes as building blocks. The primary advantages of the dual-cube over the...
Abstract. In this work an architecture of an automatically tuned linear algebra library proposed in previous works is extended in order to adapt it to platforms where both the CPU ...
We investigate issues related to the probe complexity of quorum systems and their implementation in a dynamic environment. Our contribution is twofold. The first regards the algo...
Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive...