Increased platform heterogeneity and varying resource availability in distributed systems motivate the design of resource-aware applications, which ensure a desired performance l...
When a performance crisis occurs in a datacenter, rapid recovery requires quickly recognizing whether a similar incident occurred before, in which case a known remedy may apply, o...
Peter Bodik, Moises Goldszmidt, Armando Fox, Dawn ...
The goal of online failure prediction is to forecast imminent failures while the system is running. This paper compares Similar Events Prediction (SEP) with two other well-known t...
With applications becoming larger and the increasing load on high performance systems, it is important to tackle the I/O bottleneck problem from several angles. It is not only ess...
Murali Vilayannur, Mahmut T. Kandemir, Anand Sivas...
The three problems of the title — the first two widely discussed in the literature, the third less well known but just as important for further development of object technology ...