This paper presents the design and results of the Rich Transcription Spring 2005 (RT-05S) Meeting Recognition Evaluation. This evaluation is the third in a series of community-wide...
Jonathan G. Fiscus, Nicolas Radde, John S. Garofol...
Abstract. The prosodic specification of an utterance to be spoken by a Text-to-Speech synthesis system can be devised in break indices, pitch accents and boundary tones. In particu...
Curtin University’s Talking Heads (TH) combine an MPEG-4 compliant Facial Animation Engine (FAE), a Text To Emotional Speech Synthesiser (TTES), a multi-modal Dialogue Manager (...
He Xiao, Donald Reid, Andrew Marriott, E. K. Gulla...
A novel interface system for accessing geospatial data (GeoMIP) has been developed that realizes a user-centered multimodal speech/gesture interface for addressing some of the cri...
We present a generic approach to multimodal fusion which we call context-based multimodal integration. Key to this approach is that every multimodal input event is interpreted and...