High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...
Near-duplicate detection is not only an important pre and post processing task in Information Retrieval but also an effective spam-detection technique. Among different approache...
In the last years Blog Search has been a new exciting task in Information Retrieval. The presence of user generated information with valuable opinions makes this field of huge in...
This paper presents a segmentation-based handwriting recognizer and the performance that it achieves on the numerical fields extracted from a large single-writer historical collec...
Marius Bulacu, Axel Brink, Tijn van der Zant, Lamb...
Abstract. Current text classification systems typically use term stems for representing document content. Semantic Web technologies allow the usage of features on a higher semantic...