Model M, a novel class-based exponential language model, has been shown to significantly outperform word n-gram models in state-of-the-art machine translation and speech recognit...
In this paper, we describe an approach for the automatic medical annotation task of the 2008 CLEF cross-language image retrieval campaign (ImageCLEF). The data comprise 12076 full...
he recent digitization of more than twenty million books has been led by initiatives from countries wishing to preserve their cultural heritage and by commercial endeavors, such a...
Bing Hu, Thanawin Rakthanmanon, Bilson J. L. Campa...
The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a high-level language without grappling with the ...
In this paper, we propose a robust model selection criterion for mixtures of subspaces called minimum effective dimension (MED). Previous information-theoretic model selection cri...