Record linkage, the problem of determining when two records refer to the same entity, has applications both for data cleaning (deduplication) and for integrating data from multipl...
To cope with the large volume of biological sequences being produced, a significant number of genes and proteins have been annotated by automated tools. A protein annotation is an...
In this paper we present a novel approach for labeling clusters of multimedia content that leverages supervised classification techniques in conjunction with unsupervised cluster...
Understanding the differences between contrasting groups is a fundamental task in data analysis. This realization has led to the development of a new special-purpose data mining t...
Geoffrey I. Webb, Shane M. Butler, Douglas A. Newl...
User-contributed tags have shown promise as a means of indexing multimedia collections by harnessing the combined efforts and enthusiasm of online communities. But tags are only o...