This paper studies structured data extraction from Web pages, e.g., online product description pages. Existing approaches to data extraction include wrapper induction and automatic...
Contracts and specifications have long been used in object-oriented design, programming and testing to enhance reliability before software deployment. However, the use of specifica...
The performance of computer systems depends, among other things, on the workload. Performance evaluations are therefore often done using logs of workloads on current productions s...
XML database systems emerge as a result of the acceptance of the XML data model. Recent works have followed the promising approach of building XML database management systems on un...
Abstract. We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query rou...
Sebastian Michel, Matthias Bender, Peter Triantafi...