In large-scale distributed systems, asynchronous communication and future objects are becoming widespread mechanisms to tolerate high latencies and to improve global performance...
Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional ...
C. Scott Ananian, Krste Asanovic, Bradley C. Kuszm...
The practical realization of managing and executing large-scale scientific computations efficiently and reliably is quite challenging. Scientific computations often invo...
Yong Zhao, Ioan Raicu, Ian T. Foster, Mihael Hateg...
The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...
Field-Programmable Gate Arrays (FPGAs) are being employed in high-performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Par...
Uday Bondhugula, Ananth Devulapalli, James Dinan, ...