Three types of video surrogates, namely visual (keyframes), verbal (keywords/phrases), and visual and verbal, were designed and studied in a qualitative investigation of user cognitive pro...
The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motion, such as those of the head, convey additional information...
Iain Matthews, Timothy F. Cootes, J. Andrew Bangha...
ImageCLEF is a pilot experiment run at CLEF 2003 for cross language image retrieval using textual captions related to image contents. In this paper, we describe the participation o...
We address evaluation of image understanding and retrieval on large-scale image data in the context of three evaluation projects. The first project is a comprehensive strategy for e...
Keiji Yanai, Nikhil V. Shirahatti, Prasad Gabbur, ...
Increasing use of multimedia data makes it crucial to use intelligent search mechanisms for retrieving multimedia data by content. Digital video requires the incorporation of tem...
Serhan Dagtas, Wasfi Al-Khatib, Arif Ghafoor, Ashf...