At the core of contemporary high-performance computer systems is the communication infrastructure. For this reason, there has been a lot of work on providing low-latency, high-ban...
Sven Karlsson, Stavros Passas, George Kotsis, Ange...
We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by ...
Dorian C. Arnold, Dong H. Ahn, Bronis R. de Supins...
Stencil computations form the performance-critical core of many applications. Tiling and parallelization are two important optimizations to speed up stencil computations. Many til...
This paper describes FT64 and Multi-FT64, single- and multi-coprocessor systems designed for high-performance scientific computing with streams. We give a detailed case study of po...
Mei Wen, Nan Wu, Chunyuan Zhang, Wei Wu, Qianming ...
Parallel file systems are widely used in clusters to provide high-performance I/O. However, most of the existing parallel file systems are based on UNIX-like operating systems. W...