This paper presents a novel networking architecture designed for communication intensive parallel applications running on clusters of workstations (COWs) connected by highspeed ne...
Marcel-Catalin Rosu, Karsten Schwan, Richard Fujim...
With the right techniques, multicore architectures may be able to continue the exponential performance trend that elevated the performance of applications of all types for decades...
Arun Raman, Hanjun Kim, Thomas R. Mason, Thomas B....
The Standard Template Adaptive Parallel Library (stapl) is a parallel programming framework that extends C++ and stl with support for parallelism. stapl provides a collection of pa...
Gabriel Tanase, Chidambareswaran Raman, Mauro Bian...
Many modern embedded processors (esp. DSPs) support partitioned memory banks (also called X-Y memory or dual bank memory) along with parallel load/store instructions to achieve co...
Xiaotong Zhuang, Santosh Pande, John S. Greenland ...
As shared-memory multiprocessors become the dominant commodity source of computation, parallelizing compilers must support mainstream computations that manipulate irregular, point...