Learning useful and predictable features from past workloads and exploiting them well is a major source of improvement in many operating system problems. We review known parallel ...
We further increase the efficiency of Java RMI programs. Where other optimizing re-implementations of RMI use pre-processors to create stubs and skeletons and to create class spe...
In computer systems today, speed and responsiveness are often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channe...
Kenneth W. Preslan, Andrew P. Barry, Jonathan Bras...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
As high performance clusters continue to grow in size, the mean time between failures shrinks. Thus, the issues of fault tolerance and reliability are becoming one of the challengi...