The field of speaker identification has recently seen significant advancement, but improvements have tended to be benchmarked on near-field speech, ignoring the more realistic set...
In this study, a system that discriminates laughter from speech by modelling the relationship between audio and visual features is presented. The underlying assumption is that thi...
This paper summarizes recent work at Microsoft on the development of novel direct models. The key characteristic of our approaches is the use of long-span segment-level features t...
Tracking multiple objects is important in many application domains. We propose a novel algorithm for multi-object tracking that is capable of working under very challenging conditi...
Via collaborative beamforming, nodes in a wireless network are able to transmit a common message over long distances in an energy-efficient fashion. However, the process of mak...