The availability of a large, freely redistributable set of high-quality annotated images is critical to allowing researchers in the area of automatic annotation, generic object rec...
This paper addresses the application of PCA to categorical data prior to diagnosing a patient data set using a Case-Based Reasoning (CBR) system. The particularity is th...
This paper deals with detecting change of distribution in multi-dimensional data sets. For a given baseline data set and a set of newly observed data points, we define a statistic...
Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, S...
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
When is it safe to use synthetic data in supervised classification? Trainable classifier technologies require large representative training sets consisting of samples labeled with...