The use of Mercer kernel methods in statistical learning theory provides strong learning capabilities, as seen in kernel principal component analysis and support vector machin...
As applications for artificially intelligent agents increase in complexity, we can no longer rely on clever heuristics and hand-tuned behaviors to develop their programming. Even t...
Shawn Arseneau, Wei Sun, Changpeng Zhao, Jeremy R....
In this paper we show how common speech recognition training criteria such as the Minimum Phone Error criterion or the Maximum Mutual Information criterion can be extended to inco...
In many real-world applications, labeled data are usually expensive to obtain, while a large amount of unlabeled data may be available. To reduce the labeling cost, active learning attem...
Chun Chen, Zhengguang Chen, Jiajun Bu, Can Wang, L...
High-dimensional data that lies on or near a low-dimensional manifold can be described by a collection of local linear models. Such a description, however, does not provide a glob...
Sam T. Roweis, Lawrence K. Saul, Geoffrey E. Hinto...