Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all n computers are up and running, we would like the load to be evenly distr...
Initial versions of MPI were designed to work efficiently on multi-processors which had very little job control and thus static process models. Subsequently forcing them to suppor...
In this paper, we describe a scheme for tolerating and recovering from mid-query faults in a distributed shared nothing database. Rather than aborting and restarting queries, our s...
Christopher Yang, Christine Yen, Ceryen Tan, Samue...
Quorum protocols offer several benefits when used to maintain replicated data but techniques for reducing overheads associated with them have not been explored in detail. It is d...
Lei Kong, Deepak J. Manohar, Mustaque Ahamad, Arun...
Wikis have proven to be a valuable tool for collaboration and content generation on the web. Simple semantics and ease-of-use make wiki systems well suited for meeting many emergi...