Distributed storage systems employ replicas or erasure codes to ensure high reliability and availability of data. Such replicas create a great amount of network traffic that negative...
The concept of natural neighbors employs the notion of distance to define local neighborhoods in discrete data. Especially when querying and accessing large scale data, it is im...
Scientific applications require sophisticated data management capabilities. We present the design and implementation of a Data Replication Service (DRS), one of a planned set of h...
Ann L. Chervenak, Robert Schuler, Carl Kesselman, ...
A fundamental building block of many data mining and analysis approaches is density estimation as it provides a comprehensive statistical model of a data distribution. For that re...
In this paper we measured and analyzed the workload on Yahoo! Video, the 2nd largest U.S. video sharing site, to understand its nature and the impact on online video data center d...