Abstract. In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimal user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
Abstract. Governments often hold very rich data, and whilst much of this information is published and available for re-use by others, it is often trapped by poor data structures, lo...
Harith Alani, David Dupplaw, John Sheridan, Kieron...
Site maps are frequently provided on Web sites as a navigation support for Web users. The automatic generation of site maps is a complex task since the structure of the data, sema...
Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple...
Many tasks require users to extract information from diverse sources, to edit or process this information locally, and to explore how the end results are affected by changes in th...