We introduce obstruction-freedom, a new nonblocking property for shared data structure implementations. This property is strong enough to avoid the problems associated with locks,...
Hardware prefetching is a simple and effective technique for hiding cache miss latency and thus improving the overall performance. However, it comes with addition of prefetch buff...
Graphics and media processing is quickly emerging to become one of the key computing workloads. Programmable graphics processors give designers extra flexibility by running a sma...
Mauricio Breternitz Jr., Herbert H. J. Hum, Sanjee...
This paper is devoted to the study of number representations and algorithms leading to efficient implementations of modular adders and multipliers on recent Field Programmable Ar...
UPC is an explicit parallel extension of ANSI C, which has been gaining rising attention from vendors and users. In this work, we consider the low-level monitoring and experimenta...