In high-end processors, increasing the number of in-flight instructions can improve performance by overlapping useful processing with long-latency accesses to the main memory. Buf...
Load balancing in a cluster system has been investigated extensively, mainly focusing on the effective usage of global CPU and memory resources. However, if a significant portion ...
Xiao Qin, Hong Jiang, Yifeng Zhu, David R. Swanson
In recent years the power of Grid computing has grown exponentially through the development of advanced middleware systems. While usage has increased, the penetration of Grid compu...
Gregor von Laszewski, Andrew J. Younge, Xi He, Kum...
As the size of parallel computers and the number of sources per router node increase, congestion inside the interconnection network rises significantly. In such systems, ...
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performa...