Cloud computing has emerged as a new approach to large scale computing and is attracting a lot of attention from the scientific and research computing communities. Despite its gro...
The recent emergence of clouds is making the vision of utility computing realizable, i.e. computing resources and services can be delivered, utilized, and paid for as utilities su...
Hadoop is a reference software framework supporting the Map/Reduce programming model. It relies on the Hadoop Distributed File System (HDFS) as its primary storage system. Althoug...
Quality of data plays a very important role in any scientific research. In this paper we present some of the challenges that we face in managing and maintaining data quality for a...
Current projects that automate the collection of provenance information use a centralized architecture for managing the resulting metadata - that is, provenance is gathered at rem...