Counting (identical) objects in images is a simple yet fundamental recognition task that requires exhaustive human effort. Automation of this task would reduce the human load sign...
Takumi Kobayashi, Tadaaki Hosaka, Shu Mimura, Taka...
This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our wo...
This paper introduces a framework that employs the Fisher linear discriminant model (FLDM) and classifier (FLDC) on integrated facial appearance and facial expression features. T...
Pohsiang Tsai, Tich Phuoc Tran, Tom Hintz, Tony Ja...
A common way to evaluate the performance of a system is to compare the algorithmic outputs with ground truth to identify divergences in the system’s performance and discover the...
Audio segmentation has applications in a variety of contexts, such as audio information retrieval, automatic sound analysis, and pre-processing for speech recognition. Ex...
Tara N. Sainath, Dimitri Kanevsky, Giridharan Iyen...