Web search engines discover indexable documents by recursively ‘crawling’ from a seed URL. Their rankings take into account link popularity. While this works well, it introduc...
Tom Rowlands, David Hawking, Ramesh Sankaranarayan...
Abstract— Recent advances in graph-based search techniques derived from Kleinberg’s work [1] have been impressive. This paper further improves the graph-based search algorithm ...
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Today, customers want to use powerful search engines for their huge and increasing content repositories. Full-text-only products with simple result lists are not enough to satisfy t...
Search engines present fixed-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model-based methods for extr...