Sciweavers

» Data Set Balancing
MIR
2010
ACM
16 years 1 month ago
Automatically annotating the MIR Flickr dataset: experimental protocols, openly available data and semantic spaces
The availability of a large, freely redistributable set of high-quality annotated images is critical to allowing researchers in the area of automatic annotation, generic object rec...
Jonathon S. Hare, Paul H. Lewis
HIS
2008
15 years 8 months ago
Diagnosing Patients Combining Principal Components Analysis and Case Based Reasoning
This paper addresses the application of PCA to categorical data prior to diagnosing a patient data set using a Case-Based Reasoning (CBR) system. The particularity is th...
Carles Pous, Dani Caballero, Beatriz López
KDD
2007
ACM
16 years 7 months ago
Statistical change detection for multi-dimensional data
This paper deals with detecting change of distribution in multi-dimensional data sets. For a given baseline data set and a set of newly observed data points, we define a statistic...
Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, S...
SIGMOD
2001
ACM
16 years 6 months ago
Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Christian Böhm, Bernhard Braunmüller, Fl...
DRR
2009
15 years 4 months ago
Using synthetic data safely in classification
When is it safe to use synthetic data in supervised classification? Trainable classifier technologies require large representative training sets consisting of samples labeled with...
Jean Nonnemaker, Henry Baird