Minimization of the execution time of an iterative application in a heterogeneous parallel computing environment requires an appropriate mapping scheme for matching and scheduling...
Yu-Kwong Kwok, Anthony A. Maciejewski, Howard Jay ...
This work provides a systematic study of the impact of communication performance on parallel applications in a high-performance network of workstations. We develop an experimental ...
Richard P. Martin, Amin Vahdat, David E. Culler, T...
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state-of-the-art high-level synthesis tool AutoPilot from AutoESL, to...
Modern-day enterprises have a large IT infrastructure comprising thousands of applications running on servers housed in tens of data centers geographically spread out. T...
Rahul Singh, Prashant J. Shenoy, K. K. Ramakrishna...
In Thread-Level Speculation (TLS), speculative tasks generate memory state that cannot simply be combined with the rest of the system because it is unsafe. One way to deal with th...