MPI Alltoall is one of the most communication intense collective operation used in many parallel applications. Recently, the supercomputing arena has witnessed phenomenal growth o...
Rahul Kumar, Amith R. Mamidala, Dhabaleswar K. Pan...
Central to many problems in scene understanding based on using a network of tens, hundreds or even thousands of randomly distributed cameras with on-board processing and wireless c...
Shubao Liu, Kongbin Kang, Jean-Philippe Tarel and ...
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
Abstract—Network-on-Chips (NoCs) paradigm is fast becoming a defacto standard for designing communication infrastructure for multicores with the dual goals of reducing power cons...
Dominic DiTomaso, Avinash Kodi, Savas Kaya, David ...