We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU: cyclic reduction (CR), parallel cyclic reduction (...
In the past, efficient parallel algorithms have always been developed specifically for the successive generations of parallel systems (vector machines, shared-memory machines, d...
Abstract. With the help of the FPGA technology, the boarder between hardand software has vanished. It is now possible to develop complex designs and fine grained parallel applicat...
Task graph scheduling has been found effective in performance prediction and optimization of parallel applications. A number of static scheduling algorithms have been proposed for...
Abstract. Recent trend in high-performance computing focuses on networks of workstations (NOWs) as a way ofrealizing cost-effective parallel machines. This has been due to the avai...