Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they re...
Within the communication process of human beings, the speaker's facial expression and lip-shape movement contains extremely rich language information. The hearing impaired, a...
Traditional photometric stereo algorithms employ a Lambertian reflectance model with a varying albedo field and involve the appearance of only one object. In this paper, we gene...
Shaohua Kevin Zhou, Gaurav Aggarwal, Rama Chellapp...
Within the field of action recognition, features and descriptors are often engineered to be sparse and invariant to transformation. While sparsity makes the problem tractable, it ...
Recognition of chatting activities in social interactions is useful for constructing human social networks. However, the existence of multiple people involved in multiple dialogue...