One of the main drawbacks of the Semantic Web is the lack of semantically rich data, since most of the information is still stored in relational databases. In this paper, we presen...
The high quality, structured data from Web structured sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
The rapid growth of the web has been noted and tracked extensively. Recent studies have however documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
Web object is defined to represent any meaningful object embedded in web pages (e.g. images, music) or pointed to by hyperlinks (e.g. downloadable files). Users usually search for...
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...