Commodity accelerator technologies including reconfigurable devices provide an order of magnitude performance improvement compared to mainstream microprocessor systems. A number o...
Sadaf R. Alam, Jeffrey S. Vetter, Melissa C. Smith
The DAG-based task graph model has been found effective in scheduling for performance prediction and optimization of parallel applications. However the scheduling complexity and s...
This paper presents the design and implementation of the MPI-IO interface for the Clusterfile parallel file system. The approach offers the opportunity of achieving a high corelat...
Achieving high performance on today’s architectures requires careful orchestration of many optimization parameters. In particular, the presence of shared-caches on multicore arch...
In this paper we describe a sequential simulation method to simulate large parallel homo- and heterogeneous systems on a single FPGA. The method is applicable for parallel systems...