Current practice in the design of application software for high-performance embedded computing systems is characterized by long development times, lack of interoperability with ot...
Abstract. We present a unified approach for expressing high performance numerical linear algebra routines for a class of dense and sparse matrix formats and shapes. As with the Stan...
This paper evaluates the use of per-node multi-threading to hide remote memory and synchronization latencies in a software DSM. As with hardware systems, multi-threading in softwa...
To maintain the integrity of the US nuclear stockpile without detonating nuclear weapons, the DOE needs the results of computer simulations that overwhelm the world's most po...
Many algorithms to schedule DAGs on multiprocessors have been proposed, but little work has been done to determine their effectiveness. Since multiprocessor scheduling is a...
Carolyn McCreary, A. A. Khan, J. J. Thompson, M. E...