Instruction set customization accelerates the performance of applications by compressing the length of critical dependence paths and reducing the demands on processor resources. W...
Sami Yehia, Nathan Clark, Scott A. Mahlke, Kriszti...
Load balancing is a technique which allows efficient parallelization of irregular workloads, and a key component of many applications and parallelizing runtimes. Work-stealing is ...
Maged M. Michael, Martin T. Vechev, Vijay A. Saras...
Register integration (or just integration) is a register renaming discipline that implements instruction reuse via physical register sharing. Initially developed to perform squash...
Embedded processors are required to achieve high performance while running on batteries. Thus, they must exploit all the possible means available to reduce energy consumption w...
The widening gap between CPU complexity and verification capability is becoming increasingly salient. It is impossible to completely verify the functionality of a modern mic...