We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We...
David M. Blei, Thomas L. Griffiths, Michael I. Jor...
We develop data structures for dynamic closest pair problems with arbitrary (not necessarily geometric) distance functions, based on a technique previously used by the author for ...
We divide a string into k segments, each with only one sort of symbols, so as to minimize the total number of exceptions. Motivations come from machine learning and data mining. F...
A novel method for the robust identification of interpretable fuzzy models, based on the criterion that identification errors are least sensitive to data uncertainties and modelli...