Large sparse matrices play important role in many modern information retrieval methods. These methods, such as clustering, latent semantic indexing, performs huge number of computa...
For many applied problems in the context of clustering via mixture models, the estimates of the component means and covariance matrices can be affected by observations that are at...
We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. In this framework, we motivate minimum entropy regulariza...
Manually constructing an inventory of word senses has suffered from problems including high cost, arbitrary assignment of meaning to words, and mismatch to domains. To overcome th...
We introduce an information theoretic method for nonparametric, nonlinear dimensionality reduction, based on the infinite cluster limit of rate distortion theory. By constraining...