Biomedical research on human subjects often requires a large amount of data to be collected by personal interviews, Internet-based questionnaires, lab measurements or by extracting...
Large-scale distributed data management with P2P systems requires the existence of similarity operators for queries as we cannot assume that all users will agree on exactly the sa...
Fundamental to data cleaning is the need to account for multiple data representations. We propose a formal framework that can be used to reason about and manipulate data represent...
Nowadays, there is an emergence of spatial or geographic data stored in several heterogeneous databases, mostly in Geographic Information Systems (GIS). The diversity of GIS a...
Synthetically generated data has always been important for evaluating and understanding new ideas in database research. In this paper, we describe a data generator for generating ...