We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cached-based, high-end parallel architectures. We consider supern...
—Remote atomic memory operations are critical for achieving high-performance synchronization in tightly-coupled systems. Previous approaches to implementing atomic memory operati...
Keith D. Underwood, Michael Levenhagen, K. Scott H...
New heterogeneous multiprocessor platforms are emerging that are typically composed of loosely coupled components that exchange data using programmable interconnections. The compon...
We propose a radical approach to relational query processing that aims at automatically and consistently achieving a good performance on any memory hierarchy. We believe this auto...
Motivated by experience gained during the validation of a recent Approximate Mean Value Analysis (AMVA) model of modern shared memory architectures, this paper re-examines the &qu...