Web sites allow the collection of vast amounts of navigational data – clickstreams of user traversals through the site. These massive data stores offer the tantalizing possibil...
Kaushik Dutta, Debra E. VanderMeer, Anindya Datta,...
It is well known that many hard tasks considered in machine learning and data mining can be solved in a rather simple and robust way with an instance- and distance-based approach. In...
Computing a suitable measure of consensus among several clusterings on the same data is an important problem that arises in several areas such as computational biology and data mi...
Piotr Berman, Bhaskar DasGupta, Ming-Yang Kao, Jie...
The idea that context is important when predicting customer behavior has been maintained by scholars in marketing and data mining. However, no systematic study measuring how much t...
Cosimo Palmisano, Alexander Tuzhilin, Michele Gorg...
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) gr...