This paper introduces a dynamic layout optimization strategy to minimize the number of cycles spent in memory accesses in a cache-based memory environment. In this approach, a giv...
N. E. Crosbie, Mahmut T. Kandemir, Ibrahim Kolcu, ...
The increasing numbers of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
This paper explains how efficient support for semiregular distributions can be incorporated in a uniform compilation framework for hybrid applications. The key focus of this work ...
Designing a cost effective superscalar architecture for x86 compatible microprocessors is a challenging task in terms of both technical difficulty and commercial value. One of the...
This paper presents a new and simple prefetching mechanism to improve the memory performance of multimedia applications. This method adapts the memory access mechanism to the acce...