Clustering is an essential data mining task with various types of applications. Traditional clustering algorithms are based on a vector space model representation. A relational dat...
Sorting is an important component of many applications, and parallel sorting algorithms have been studied extensively in the last three decades. One of the earliest parallel sorti...
Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained in legacy sources and text collections into a data ...
An important issue arising from Peer-to-Peer applications is how to accurately and efficiently retrieve a set of K best matching data objects from different sources while minimizi...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...