We propose a visualization method based on a topic model for discrete data such as documents. Unlike conventional visualization methods based on pairwise distances such as multi-d...
Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different clu...
Several important time series data mining problems reduce to the core task of finding approximately repeated subsequences in a longer time series. In an earlier work, we formalize...
Bill Yuan-chi Chiu, Eamonn J. Keogh, Stefano Lonar...
We construct binary codes for fingerprinting. Our codes for n users that are ε-secure against c pirates have length O(c^2 log(n/ε)). This improves the codes proposed by Boneh and Sh...
One of the most common operations in analytic query processing is the application of an aggregate function to the result of a relational join. We describe an algorithm for computi...
Chris Jermaine, Alin Dobra, Subramanian Arumugam, ...