A definition of types in an information system is given from real-world abstractions through data constructs, schema and definitions to physical data values. Category theory suggests tha...
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
Background: An important and yet rather neglected question related to bioinformatics predictions is the estimation of the amount of data that is needed to allow reliable predictio...
Recent research identifies a growing privacy problem that exists within Online Social Networks (OSNs). Several studies have shown how easily strangers can extract personal data a...
Much work on skewed, stochastic, high-dimensional, and biased datasets usually implicitly solves each problem separately. Recently, however, we have been approached by Texas Commiss...
Kun Zhang, Wei Fan, Xiaojing Yuan, Ian Davidson, X...