A powerful and widely used method for analyzing the performance behavior of parallel programs is event tracing. When an application is traced, performance-relevant events, such as...
Felix Wolf, Felix Freitag, Bernd Mohr, Shirley Moo...
Multipartitioning is a skewed-cyclic block distribution that yields better parallel efficiency and scalability for line-sweep computations than traditional block partitionings. Th...
Federated simulation interfaces such as the High Level Architecture (HLA) were designed for interoperability, and as such are not traditionally associated with high-performance com...
Kalyan S. Perumalla, Alfred Park, Richard M. Fujim...
The second generation of the Advanced Telecom Computing Architecture (ATCA), based on the PCI Industrial Computer Manufacturers Group (PICMG) specification, has evolved to a live deployment...
An asynchronous work-stealing scheme for dynamic load balancing is implemented in Unified Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...