One of the critical challenges for service-oriented computing systems is the capability to guarantee scalable and reliable service provision. This paper presents Reliable GeoGri...
Gong Zhang, Ling Liu, Sangeetha Seshadri, Bhuvan B...
To observe, analyze, and control large-scale distributed systems and the applications hosted on them, there is an increasing need to continuously monitor performance attributes of ...
Shicong Meng, Srinivas R. Kashyap, Chitra Venkatra...
Memory bugs in C/C++ programs severely affect system availability and security. This paper presents First-Aid, a lightweight runtime system that survives software failures caused ...
Device drivers are notorious for being a major source of failure in operating systems. In analysing a sample of real defects in Linux drivers, we found that a large proportion (39...
Leonid Ryzhyk, Peter Chubb, Ihor Kuz, Gernot Heise...
The database tier of dynamic content servers at large Internet sites is typically hosted on centralized and expensive hardware. Recently, research prototypes have proposed using d...