In large-scale, self-organized and distributed systems, such as peer-to-peer (P2P) overlays and wireless sensor networks (WSN), a small proportion of nodes are likely to be more c...
Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely used work scheduling technique in such environme...
Krishnaveni Budati, Jason D. Sonnek, Abhishek Chan...
Due to the increasing complexity of scientific models, large-scale simulation tools often require a critical amount of computational power to produce results in a reasonable amou...
It is a challenge to design and implement a wide-area distributed hash table (DHT) which provides a storage service with high reliability. Many existing systems use replication to...
Jing Zhao, Hongliang Yu, Kun Zhang, Weimin Zheng, ...
The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...