Performance evaluation using only a subset of programs from a benchmark suite is commonplace in computer architecture research. This is especially true during early design space e...
A dynamic binary translation system for a co-designed virtual machine is described and evaluated. The underlying hardware directly executes an accumulator-oriented instruction set...
Empirical search is a strategy used during the installation of library generators such as ATLAS, FFTW, and SPIRAL to identify the algorithm or the version of an algorithm that del...
Instruction set customization accelerates the performance of applications by compressing the length of critical dependence paths and reducing the demands on processor resources. W...
Sami Yehia, Nathan Clark, Scott A. Mahlke, Kriszti...
—CMOS scaling has long been a source of dramatic performance gains. However, semiconductor feature size reduction has resulted in increasing levels of operating temperatures and ...
Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott ...