In this paper, an approach for polyphonic music transcription based on joint multiple-F0 estimation and note onset/offset detection is proposed. For preprocessing, the resonator t...
In context-dependent acoustic modeling, it is important to strike a balance between detailed modeling and data sufficiency for robust estimation of model parameters. In the past,...
This paper describes recent advances at LIMSI in Mandarin Chinese speech-to-text transcription. A number of novel approaches were introduced in the different system components. Th...
Lori Lamel, Jean-Luc Gauvain, Viet-Bac Le, Ilya Op...
Cross-lingual voice transformation is challenging when source language (L1) and target language (L2) are very different in corresponding phonetics and prosodies. We propose a fram...
It is well known that MFCC based speaker identification (SID) systems easily break down under mismatched training and test conditions. One such mismatch occurs when a SID system ...