We describe the name analysis and pronunciation component in the German version of the Bell Labs multilingual text-to-speech system. We concentrate on street names because they enc...
This paper reports on the Large Scale Hierarchical Classification workshop (http://kmi.open.ac.uk/events/ecir2010/workshops-tutorials), held in conjunction with the European Conf...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA) to deal with textual errors in the document collection. Our model is inspired by...
Author identification models fall into two major categories according to the way they handle the training texts: profile-based models produce one representation per author while in...
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...