— To address the rapid growth of the Internet, modern Web search engines have to adopt distributed organizations, where the collection of indexed documents is partitioned among s...
Diego Puppin, Fabrizio Silvestri, Raffaele Perego,...
In this paper, we propose a news index system which supports users who would like to observe difference in various topics (e.g. politics, economy, education, and culture) among cou...
Tomoya Noro, Bin Liu, Yosuke Nakagawa, Hao Han, Ta...
Internet is a huge source of information. Search engines have indexed much of this information and are able to extract the relevant webpages that are related to a given query. Howe...
In this work we compare different techniques to automatically find candidate web pages to substitute broken links. We extract information from the anchor text, the content of the p...
We consider the problem of sampling URLs uniformly at random from the Web. A tool for sampling URLs uniformly can be used to estimate various properties of Web pages, such as the ...
Monika Rauch Henzinger, Allan Heydon, Michael Mitz...