Our research addresses the efficient transfer of large data across wide-area networks, focusing on applications like remote visualization and real-time collaboration. To attain h...
Characterizing shared-memory applications provides insight to design efficient systems, and provides awareness to identify and correct application performance bottlenecks. Configu...
The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on b...
Robert B. Ross, Robert Latham, William Gropp, Ewin...
The performance skeleton of an application is a short running program whose performance in any scenario reflects the performance of the application it represents. Such a skeleton ...
Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing programmability with the potential for high computation throughput, scalab...