Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor...
As the technology for high-speed networks has evolved over the last decade, the interconnection of commodity computers (e.g., PCs and workstations) at gigabit rates has become a re...
Mark Baker, Paul A. Farrell, Hong Ong, Stephen L. ...
Abstract. Parallel Computational Science and Engineering (CSE) applications often exhibit irregular structure and dynamic load patterns. Many such applications have been developed ...
Abstract. This paper describes a mechanism for “fusing” concurrent invocations of exclusive methods. The target of our work is object-oriented languages with concurrent extensi...
Memory latency is becoming an increasingly important performance bottleneck, especially in multiprocessors. One technique for tolerating memory latency is multithreading, whereby we sw...