Abstract—The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...
This paper presents a helper thread prefetching scheme that is designed to work on loosely-coupled processors, such as in a standard chip multi-processor (CMP) system and in an in...
Changhee Jung, Daeseob Lim, Jaejin Lee, Yan Solihi...
Accurate estimation of signal delay is critical to the design and verification of VLSI circuits. At very high frequencies, signal delay in circuits with small feature sizes is do...
Many practical applications generate irregular, nonbalanced divide-and-conquer trees which have different depths, possibly also different numbers of successors at different levels...
Instruction fetch bandwidth is feared to be a major limiting factor to the performance of future wide-issue aggressive superscalars. In this paper, we focus on Database applicatio...