Abstract. We describe the structure of a compilation system that generates code for processor architectures supporting both explicit and implicit parallel threads. Such architectur...
A novel dynamic register renaming approach is proposed in this work. The key idea of the novel scheme is to delay the allocation of physical registers until a late stage in the pi...
As the gap between memory and processor speeds continues to widen, cache efficiency is an increasingly important component of processor performance. Compiler techniques have been...
Brad Calder, Chandra Krintz, Simmi John, Todd M. A...
Improvements in main memory speeds have not kept pace with increasing processor clock frequency and improved exploitation of instruction-level parallelism. Consequently, the gap b...
Predicated execution is a promising architectural feature for exploiting instruction-level parallelism in the presence of control flow. Compiling for predicated execution involve...