It is well known that pragmatic knowledge is useful and necessary in many difficult language processing tasks, but because this knowledge is difficult to acquire and process autom...
User-generated content is characterized by short, noisy documents with many spelling errors and unexpected language usage. To bridge the vocabulary gap between the user's in...
Wouter Weerkamp, Krisztian Balog, Maarten de Rijke
We present a generative model for determining the information content of a message without analyzing the message content. Such a tool is useful for automated analysis of the vast ...
Yingjie Zhou, Malik Magdon-Ismail, William A. Wall...
The Semantic Web is a rapidly growing research area aiming at the exchange of semantic information over the World Wide Web. The Semantic Web is built on top of RDF, an XML-based ex...
Long-term search history contains rich information about a user's search preferences. In this paper, we study methods based on statistical language modeling to mine contextual i...