Tiling, a key transformation for optimizing programs, has been widely studied in literature. Parameterized tiled code is important for auto-tuning systems since they often execute...
Muthu Manikandan Baskaran, Albert Hartono, Sanket ...
In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we...
Hikmet Dursun, Kevin J. Barker, Darren J. Kerbyson...
Multi-core processors have changed the conventional hardware structure and require a rethinking of system scheduling and resource management to utilize them efficiently. However, ...
The growing computational and storage needs of several scientific applications mandate the deployment of extreme-scale parallel machines, such as IBM’s Blue Gene/L which can acc...
Basic retiming is an algorithm originally developed for hardware optimization. Software pipelining is a technique proposed to increase instruction-level parallelism for parallel p...