Simultaneously segmenting and labeling images is a fundamental problem in Computer Vision. In this paper, we introduce a hierarchical CRF model to deal with the problem of labelin...
In this paper, an approach for polyphonic music transcription based on joint multiple-F0 estimation and note onset/offset detection is proposed. For preprocessing, the resonator t...
We describe feature space and model space discriminative training for a new class of acoustic models called Bayesian sensing hidden Markov models (BS-HMMs). In BS-HMMs, speech dat...
Emotion recognition from speech plays an important role in developing affective and intelligent systems. This study investigates sentence-level emotion recognition. We propose to ...
Cashiers in retail stores usually exhibit certain repetitive and periodic activities when processing items. Detecting such activities plays a key role in most retail fraud detecti...