With the right techniques, multicore architectures may be able to continue the exponential performance trend that elevated the performance of applications of all types for decades...
Arun Raman, Hanjun Kim, Thomas R. Mason, Thomas B....
Commodity hardware and software are growing increasingly more complex, with advances such as chip heterogeneity and specialization, deeper memory hierarchies, ne-grained power ma...
As shared-memory multiprocessors become the dominant commodity source of computation, parallelizing compilers must support mainstream computations that manipulate irregular, point...
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms PBLAS. The PBLAS are targeted at distributed vector-vector, matrix-vector and matrixmatrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
The growth of commercial and academic interest in parallel and distributed computing during the past fifteen years has been accompanied by a corresponding increase in the number o...
Gregory V. Wilson, Jonathan Schaeffer, Duane Szafr...