We have created ZapC, a novel system for transparent coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC provides a thin virtualization ...
Clusters and distributed systems offer fault tolerance and high performance through load sharing. When all computers are up and running, we would like the load to be evenly distri...
Consistency is an important issue in Distributed Shared Memory (DSM) systems. These systems share a set of objects or virtual memory pages. The data sharing enables the applicatio...
Clusters of SMPs are becoming increasingly common. However, the shared memory design of SMPs and the consequential contention between system processors for access to main memory c...
One approach in verifying the correctness of a multiprocessor system is to show that its execution results comply with the memory consistency model it is meant to implement. It ha...