Our research aims at developing a system that paraphrases written language text to spoken language style. In such a system, it is important to distinguish between appropriate and i...
We present a grand challenge to build a corpus that will include all of the world's languages, in a consistent structure that permits large-scale cross-linguistic processing,...
This paper presents a new approach to estimate “universal” phoneme posterior probabilities for mixed language speech recognition. More specifically, we propose a new theoreti...
Speech recognition of inflectional and morphologically rich languages like Czech is currently quite a challenging task, because simple n-gram techniques are unable to capture impo...
Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...