—Clusters and applications continue to grow in size while their mean time between failure (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...
—In this paper, we analyze restrictions of traditional models affecting the accuracy of analytical prediction of the execution time of collective communication operations. In par...
Alexey L. Lastovetsky, Vladimir Rychkov, Maureen O...
Process variations in integrated circuits have significant impact on their performance, leakage and stability. This is particularly evident in large, regular and dense structures...
Modern DRAM systems rely on memory controllers that employ out-of-order scheduling to maximize row access locality and bank-level parallelism, which in turn maximizes DRAM bandwid...
We enrich the classical notion of group key exchange (GKE) protocols by a new property that allows each pair of users to derive an independent peer-to-peer (p2p) key on-demand and ...