The growing amount of online news posted on the WWW demands new algorithms that support topic detection, search, and navigation of news documents. This work presents an algorithm f...
: Hypertext categorization is the automatic classification of web documents into predefined classes. It poses new challenges for automatic categorization because of the rich inform...
Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyon...
Web search components such as ranking and query suggestions analyze the user data provided in query and click logs. While this data is easy to collect and provides information abo...
Jeff Huang, Ryen W. White, Georg Buscher, Kuansan ...
In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by random...
Robert Pasley, Paul Clough, Ross S. Purves, Floria...