We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance techniq...
George Bosilca, Remi Delmas, Jack Dongarra, Julien...
We develop a widely applicable algorithm to solve the fault diagnosis problem in certain distributed-memory multiprocessor systems in which there are a limited number of faulty pr...
We consider the problem of scheduling dependent real-time tasks for overloads on a multiprocessor system, yielding best-effort timing assurance. The application/scheduling model in...
Piyush Garyali, Matthew Dellinger, Binoy Ravindran
Massive collaborative editing has become a reality through leading projects such as Wikipedia. This massive collaboration is currently supported by a costly central service. In ord...
The match between a peer-to-peer overlay and the physical Internet infrastructure is a constant issue. Time-constrained peer-to-peer applications such as live streaming systems ar...
Ali Boudani, Yiping Chen, Gilles Straub, Gwendal S...