The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics...
Early TREC-style Question Answering Systems were characterized by the following features: (a) the answer of the question was known to be included in a given local corpus, (b) the ...
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Previous efforts on event detection from the web have focused primarily on web content and structure data, ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
Although most existing research detects events by analyzing the content or structural information of Web documents, a recent direction is to study the usage data. In th...