The Dryad and DryadLINQ systems offer a new programming model for large-scale data-parallel computing. They generalize previous execution environments such as SQL and MapReduce in...
Data lineage and data provenance are key to the management of scientific data. Not knowing the exact provenance and processing pipeline used to produce a derived data set often re...
In response to the widespread use of the XML format for document representation and message exchange, major database vendors support XML in terms of persistence, querying and inde...
XPORT is a profile-driven distributed data dissemination system that supports an extensible set of data types, profile types, and optimization metrics. XPORT efficiently implemen...
Integration systems typically support only a restricted set of queries over the schema they export. The reason is that the participating information sources contribute limited con...