Programming heterogeneous parallel computer systems is notoriously difficult, but MIMD models have proven to be portable across multi-core processors, clusters, and massively paral...
While trace cache, value prediction, and prefetching have been shown to be effective in the single-threaded superscalar, there has been no analysis of these techniques in a Simulta...
The implementation of embedded networked appliances requires a mix of processor cores and HW accelerators on a single chip. When designing such complex and heterogeneous SoCs, the...
Geert Vanmeerbeeck, Patrick Schaumont, Serge Verna...
Memory bandwidth issues present a formidable bottleneck to accelerating embedded applications, particularly data bandwidth for multiple-issue VLIW processors. Providing an efficie...
Paul Morgan, Richard Taylor, Japheth Hossell, Geor...
The adoption of transactional memory is hindered by the high overhead of software transactional memory and the intrusive design changes required by previously proposed TM hardware...
Jared Casper, Tayo Oguntebi, Sungpack Hong, Nathan...