Commercial search engines increasingly recognize freshness as an important criterion for measuring the quality of search results. However, most information retrieval met...
Abstract. We present the data modeling concepts of Tricia, an open-source Java platform used to implement enterprise web information systems as well as social software solutions inc...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
Data fusion on the Web refers to merging, into a single unified list, the ranked document lists retrieved in response to a user query by more than one Web search...
We present a strategy for answering fact-based natural language questions that is guided by a characterization of real-world user queries. Our approach, implemented in a system cal...