With the current trend toward multicore architectures, improved execution performance can no longer be obtained via traditional single-thread instruction level parallelism (ILP), ...
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
In this paper, we introduce a system named Argo which provides intelligent advertising made possible from users’ photo collections. Based on the intuition that user-generated ph...
Xin-Jing Wang, Mo Yu, Lei Zhang, Rui Cai, Wei-Ying...
In modern digital ICs, the increasing demand for performance and throughput requires operating frequencies of hundreds of megahertz, and in several cases exceeding the gigahertz r...