We consider the problem of sampling URLs uniformly at random from the Web. A tool for sampling URLs uniformly can be used to estimate various properties of Web pages, such as the ...
Monika Rauch Henzinger, Allan Heydon, Michael Mitz...
Search engines present fix-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model based methods for extr...
The dynamic nature of the World Wide Web makes it a challenge to find information that is both relevant and recent. Intelligent agents can complement the power of search engines to...
The large amount of available information on the Web makes it hard for users to locate resources about particular topics of interest. Traditional search tools, e.g., search engines...
Abstract. The Semantic Desktop is a means to support users in Personal Information Management (PIM). It provides an excellent test bed for Semantic Web technology: resources (e. g....