This paper describes an object-oriented software architecture for cluster integration and management that enables extensibility, portability, and scalability. This architecture ha...
James H. Laros III, Lee Ward, Nathan W. Dauchy, Ro...
We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...
Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...
Supermon is a flexible set of tools for high-speed, scalable cluster monitoring. Node behavior can be monitored much faster than with other commonly used methods (e.g., rstatd). ...
We have implemented a virtual machine (VM) for Java that executes on a cluster. Our cluster VM completely hides the cluster from the application, presenting a single system image...
Recently, stability-based techniques have emerged as a very promising solution to the problem of cluster validation. An inherent drawback of these approaches is the computational c...