This paper proposes a method for recovering audio-visual synchronization of multimedia content. It exploits the correlation between the acoustic and the visual signals in order to est...
Newborns must learn to structure incoming acoustic information into segments, words, phrases, etc., before they can start to learn language. This process is thought to rely on mod...
Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, ...
Texts generated by automatic speech recognition (ASR) systems have some specificities, related to the idiosyncrasies of oral productions or the principles of ASR systems, that mak...
We propose a committee-based active learning method for large vocabulary continuous speech recognition. In this approach, multiple recognizers are prepared beforehand, and the rec...
Our goal is to automatically recognize and enroll new vocabulary in a multimodal interface. To accomplish this, our technique aims to leverage the mutually disambiguating aspects o...