We propose a new fault localization technique for software bugs in large-scale computing systems. Our technique always collects per-process function call traces of a target system...
The complexity of large computer systems has raised unprecedented challenges for system management. In practice, operators often collect large volumes of monitoring data from system...
Transiently powered computing devices such as RFID tags, kinetic energy harvesters, and smart cards typically rely on programs that complete a task under tight time constraints be...
We describe the Paraflow system for connecting heterogeneous computing services together into a flexible and efficient data-mining metacomputer. There are three levels of parallel...
Fault trees provide a graphical and logical framework for analyzing the reliability of systems. A fault tree provides a conceptually simple modeling framework to represent the sys...
Ragavan Manian, Joanne Bechta Dugan, David Coppit,...