In this paper we present the venture of porting two different algorithms for solving the two-dimensional advection PDE on the CBE platform, an in-place and an outof-place one, and ...
Abstract—The Charm++ parallel programming system provides a modular performance interface that can be used to extend its performance measurement and analysis capabilities. The in...
Scott Biersdorff, Chee Wai Lee, Allen D. Malony, L...
With the advent of chip-multiprocessors (CMPs), Thread-Level Speculation (TLS) remains a promising technique for exploiting this highly multithreaded hardware to improve the perfo...
IP forwarding is one of the main bottlenecks in Internet backbone routers, as it requires performing the longest-prefix match at 10Gbps speed or higher. IPv6 forwarding further ex...
Despite the enormous amount of research and development work in the area of parallel computing, it is a common observation that simultaneous performance and ease-of-use are elusiv...