Many modern computer vision algorithms are built atop a set of low-level feature operators (such as SIFT [1], [2]; HOG [3], [4]; or LBP [5], [6]) that transform raw pixel va...
Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams such as MFCC and Time Delay of Arrival (TDOA). Typically t...
Deepu Vijayasenan, Fabio Valente, Petr Motlí...
One of the main difficulties in computing information theoretic learning (ITL) estimators is their computational complexity, which grows quadratically with the number of data samples. Considerable amount ...
Prior models of speech have been used in robust automatic speech recognition to enhance noisy speech. Typically, a single prior model is trained by pooling the entire training dat...
Arun Narayanan, Xiaojia Zhao, DeLiang Wang, Eric F...
As sensors continue to proliferate, the capability to effectively query not only sensor data but also its metadata becomes important in a wide range of applications. This ...