This paper describes a single-version algorithmic approach to fault-tolerant design in various computing systems, using static redundancy to mask transient...
This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partition...
We introduce a framework for defining a distance on the (non-Euclidean) space of Linear Dynamical Systems (LDSs). The proposed distance is induced by the action of the group of o...
Consensus is one of the most fundamental problems in fault-tolerant distributed computing. This paper proposes a mechanical method for analyzing the condition that allows one to s...
The online detection of anomalies is a vital element of operations in data centers and in utility clouds like Amazon EC2. Given ever-increasing data center sizes coupled with th...