Modern architectural trends in instruction-level parallelism (ILP) are to increase the computational power of microprocessors significantly. As a result, the demands on memory ha...
The key to increasing performance without a commensurate increase in power consumption in modern processors lies in increasing both parallelism and core specialization. Core speci...
We present Offload, a programming model for offloading parts of a C++ application to run on accelerator cores in a heterogeneous multicore system. Code to be offloaded is enclosed ...
Pete Cooper, Uwe Dolinsky, Alastair F. Donaldson, ...
In this paper we show cache-friendly implementations of the Floyd-Warshall algorithm for the All-Pairs ShortestPath problem. We first compare the best commercial compiler optimiza...
Transiently powered computing devices such as RFID tags, kinetic energy harvesters, and smart cards typically rely on programs that complete a task under tight time constraints be...