: The exposed memory hierarchies employed in many network processors (NPs) are expensive in terms of meeting the worst-case processing requirement. Moreover, it is difficult to ef...
Abstract. Loops are an important source of optimization. In this paper, we address such optimizations for those cases when loops contain kernels mapped on reconfigurable fabric. We...
Ozana Silvia Dragomir, Elena Moscu Panainte, Koen ...
A very long instruction word (VLIW) processorexploits parallelism by controlling multiple operations in a single instruction word. This paper describes the architecture and compil...
Robert Cohn, Thomas R. Gross, Monica S. Lam, P. S....
Abstract. High performance computing has become a key step to introduce computer tools, like real-time registration, in the medical field. To achieve real-time processing, one usua...
—This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications w...