To meet performance requirements as well as constraints on cost and power consumption, future embedded systems will be designed with multi-core processors. However, the question o...
Molecular Dynamics applications enhance our understanding of biological phenomena through bio-molecular simulations. Large-scale parallelization of MD simulations is challenging b...
In current superscalar processors, all floating-point resources are idle during the execution of integer programs. As previous works show, this problem can be alleviated if the fl...
Ramon Canal, Joan-Manuel Parcerisa, Antonio Gonz&a...
This paper describes a mechanism for automatic design and synthesis of very long instruction word (VLIW), and its generalization, explicitly parallel instruction computing rocesso...
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming e...