Cross-lingual voice transformation is challenging when the source language (L1) and the target language (L2) differ greatly in phonetics and prosody. We propose a fram...
This article presents an attempt to link the uploaders of videos based on the audio track of the videos. Using a subset of the MediaEval [10] Placing Task’s Flickr video set, wh...
Howard Lei, Jaeyoung Choi, Adam Janin, Gerald Frie...
In this paper, we present experiments on continuous time, continuous scale affective movie content recognition (emotion tracking). A major obstacle for emotion research has been t...
This paper presents a unified model for image editing in terms of Sparse Matrix-Vector (SpMV) multiplication. In our framework, we cast image editing as a linear energy minimizat...
This paper presents ongoing research leveraging forensic methods for automatic speaker recognition. Some of the methods forensic scientists employ include identifying speaker dist...
Kyu J. Han, Mohamed Kamal Omar, Jason W. Pelecanos...