We consider the problem of similarity search in applications where computing the similarity between two records is very expensive, and the similarity measure is not a...
Chris Jermaine, Fei Xu, Mingxi Wu, Ravi Jampani, T...
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been li...
Kuang Chen, Harr Chen, Neil Conway, Joseph M. Hell...
In this paper, we present a novel method to adapt the temporal radio maps for indoor location determination by offsetting the variational environmental factors using data mining t...
We describe an approach for multi-modal characterization of social media by combining text features (e.g. tags, a prominent example of short, unstructured text labels) with spat...
Maximum margin clustering (MMC) has recently attracted considerable interest in both the data mining and machine learning communities. It first projects data samples to a kernel...