As organizations begin to deploy large computational grids, it has become apparent that systems for observation and control of the resources, services, and applications that make ...
Data-intensive parallel applications on clouds need to deploy large data sets from the cloud's storage facility to all compute nodes as fast as possible. Many multicast algori...
Tatsuhiro Chiba, Mathijs den Burger, Thilo Kielman...
—Large-scale GPU clusters are gaining popularity in the scientific computing community. However, their deployment and production use are associated with a number of new challenge...
Volodymyr V. Kindratenko, Jeremy Enos, Guochun Shi...
The paper addresses the problem of matching and scheduling of DAG-structured application to both minimize the makespan and maximize the robustness in a heterogeneous computing sys...
As computing systems grow in complexity, the cluster and grid communities require more sophisticated tools to diagnose, debug and analyze such systems. We have developed a toolkit...
Mark K. Gardner, Wu-chun Feng, Michael Broxton, Ad...