Data sparseness is an ever dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: t...
This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consist...
Kei Hashimoto, Junichi Yamagishi, William J. Byrne...
This article introduces automatic recognition of speech without any audio information. Movements of the tongue, lips, and jaw are tracked by an Electro-Magnetic Articulography (EM...
The best performing systems in the area of automatic speaker recognition have focused on using short-term, low-level acoustic information, such as sepstral features. Recently, vari...
The goal of the speech segments extraction process is to separate acoustic events of interest (the speech segment to be recognised) in a continuously recorded signal from other par...