Sciweavers

IJBIDM
2007
15 years 6 months ago
An efficient weighted nearest neighbour classifier using vertical data representation
The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algo...
William Perrizo, Qin Ding, Maleq Khan, Anne Denton...
EMNLP
2011
14 years 6 months ago
Approximate Scalable Bounded Space Sketch for Large Data NLP
We exploit sketch techniques, especially the Count-Min sketch, a memory- and time-efficient framework that approximates the frequency of a word pair in the corpus without explic...
Amit Goyal, Hal Daumé III
MSR
2005
ACM
16 years 4 days ago
Mining student CVS repositories for performance indicators
Over 200 CVS repositories representing the assignments of students in a second-year undergraduate computer science course have been assembled. This unique data set represents many...
Keir Mierle, Kevin Laven, Sam T. Roweis, Greg Wils...
ICDM
2003
IEEE
15 years 12 months ago
Privacy-preserving Distributed Clustering using Generative Models
We present a framework for clustering distributed data in unsupervised and semi-supervised scenarios, taking into account privacy requirements and communication costs. Rather than...
Srujana Merugu, Joydeep Ghosh
ECCV
2008
Springer
16 years 8 months ago
A New Baseline for Image Annotation
Automatically assigning keywords to images is of great interest as it allows one to index, retrieve, and understand large collections of image data. Many techniques have been propo...
Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar