3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
If-conversion is a compiler technique that reduces the misprediction penalties caused by hard-to-predict branches, transforming control dependencies into data dependencies. Althou...
Network performance in tightly-coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point...
Mithuna Thottethodi, Alvin R. Lebeck, Shubhendu S....
ABSTRACT Motivated by data value locality and quality tolerance present in multimedia applications, we propose a new micro-architecture, Region-level Approximate Computation Buffer...
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...