The state of the art for object retrieval in large image databases is based on quantizing descriptors of interest points into visual words. High similarity between matching image r...
According to articulatory phonology, the gestural score is an invariant speech representation. Though the timing schemes, i.e., the onsets and offsets, of the gestural activations...
This paper presents a new probabilistic framework for Mandarin speech recognition that incorporates a sophisticated hierarchical prosody model into the conventional HMM-based system...
This paper focuses on the problem of word detection and recognition in natural images. The problem is significantly more challenging than reading text in scanned documents, and h...
The idea of an online visual vocabulary is proposed. In contrast to the accepted strategy of generating vocabularies offline, using k-means clustering over all the ...