Duplicate detection is the process of identifying multiple representations of a same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
Most database systems allow query processing over attributes that are derived at query runtime (e.g., user-defined functions and remote data calls to web services), making them e...
Justin J. Levandoski, Mohamed F. Mokbel, Mohamed E...
In recent years, there has been an explosion of publicly available RDF and OWL web pages. Some of these pages are static text files, while others are dynamically generated from la...
Stream computing research is moving from terascale to petascale levels. It aims to rapidly analyze data as it streams in from many sources and make decisions with high speed and a...
Ankur Narang, Vikas Agarwal, Monu Kedia, Vijay K. ...
Data de-duplication has become a commodity component in dataintensive systems and it is required that these systems provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...