: The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algo...
William Perrizo, Qin Ding, Maleq Khan, Anne Denton...
We exploit sketch techniques, especially the Count-Min sketch, a memory, and time efficient framework which approximates the frequency of a word pair in the corpus without explic...
Over 200 CVS repositories representing the assignments of students in a second year undergraduate computer science course have been assembled. This unique data set represents many...
Keir Mierle, Kevin Laven, Sam T. Roweis, Greg Wils...
We present a framework for clustering distributed data in unsupervised and semi-supervised scenarios, taking into account privacy requirements and communication costs. Rather than...
Automatically assigning keywords to images is of great interest as it allows one to index, retrieve, and understand large collections of image data. Many techniques have been propo...