This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by random...
Robert Pasley, Paul Clough, Ross S. Purves, Floria...
One important task of semantic web portals is to offer both end users and applications seamless access to knowledge contained in heterogeneous data sources in specific user commu...
Web users display their preferences implicitly by navigating through a sequence of pages or by providing numeric ratings to some items. Web usage mining techniques are used to ext...
Many Web services operate their own Web crawlers to discover data of interest, despite the fact that large-scale, timely crawling is complex, operationally intensive, and expensive...
Jonathan M. Hsieh, Steven D. Gribble, Henry M. Lev...