The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by t...
Research into the Internet has experienced tremendous growth within the field of information systems. In this sense, the recent literature focuses on more complex research topic...
This paper studies automatic extraction of structured data from Web pages. Each such page may contain several groups of structured data records. Existing automatic methods stil...
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
In this paper we present SIE, a transparent, intelligent Web proxy framework. Its aim is to provide an efficient and robust platform for implementing vari...
Grzegorz Andruszkiewicz, Krzysztof Ciebiera, Marci...