In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through an algorithm that can be implemented in hard...
Loops are the main time-consuming part of programs based on floating-point computations. The performance of the loops is limited either by recurrences in the computation or by the...
Abstract. The combination of a language with fine-grain implicit parallelism and a dataflow evaluation scheme is suitable for high-level programming on massively parallel architectur...
This paper introduces a new architectural approach that supports compiler-synthesized dynamic branch prediction. In compiler-synthesized dynamic branch prediction, the compiler g...
David I. August, Daniel A. Connors, John C. Gyllen...
Present-day parallel computers often face the problems of large software overheads for process switching and interprocessor communication. These problems are addressed by the Multi...
Herbert H. J. Hum, Kevin B. Theobald, Guang R. Gao