In this paper, we propose an approach to automatically mine event evolution graphs from newswires on the Web. Event evolution graph is a directed graph in which the vertices and e...
We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. ...
We study the problem of creating highly compressed fulltext index structures for versioned document collections, that is, collections that contain multiple versions of each docume...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Web users are spending more of their time and creative energies within online social networking systems. While many of these networks allow users to export their personal data or ...