As file systems reach the petabyte scale, users and administrators are increasingly interested in acquiring high-level analytical information for file management and analysis. T...
H. Howie Huang, Nan Zhang 0004, Wei Wang, Gautam D...
We study the problem of generating synthetic databases having declaratively specified characteristics. This problem is motivated by database system and application testing, data ...
We present BloomUnit, a testing framework for distributed programs written in the Bloom language. BloomUnit allows developers to write declarative test specifications that descri...
Peter Alvaro, Andrew Hutchinson, Neil Conway, Will...
Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unifie...
Cliff Engle, Antonio Lupher, Reynold Xin, Matei Za...
Provenance becomes a critical requirement for healthcare IT infrastructures, especially when pervasive biomedical sensors act as a source of raw medical streams for large-scale, a...
Min Wang, Marion Blount, John Davis, Archan Misra,...