Natural languageprocessingNLP programsare confronted with various di culties in processing HTML and XML documents, and have the potential to produce better results if linguistic i...
Hideo Watanabe, Katashi Nagao, Michael C. McCord, ...
In this paper we examine the effects of noise when creating a real-world weblog corpus for information retrieval. We focus on the DiffPost (Lee et al. 2008) approach to noise remo...
James Lanagan, Paul Ferguson, Neil O'Hare, Alan F....
We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information...
The IDEX system is a prototype of an interactive dynamic Information Extraction (IE) system. A user of the system expresses an information request in the form of a topic descripti...
Many steganographic systems are weak against visual and statistical attacks. Systems without these weaknesses offer only a relatively small capacity for steganographic messages. T...