The Merrimac supercomputer uses stream processors and a highradix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities o...
Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. D...
In this paper we present a new approach for generating high-speed optimized event-driven register transfer level (RTL) compiled simulators. The generation of the simulators is part...
Many processor architectures provide a set of addressing modes in their address generation units. For example DSPs (digital signal processors) have powerful addressing modes for e...
Over the past decade, the trajectory to the petascale has been built on increased complexity and scale of the underlying parallel architectures. Meanwhile, software developers hav...
Increases in instruction level parallelism are needed to exploit the potential parallelism available in future wide issue architectures. Predicated execution is an architectural m...
Lori Carter, Beth Simon, Brad Calder, Larry Carter...