The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Web information retrieval is best known for its use of the Web’s link structure as a source of evidence. Global link evidence is by nature query-independent, and is therefore no ...
Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
Background: Modern proteomes evolved by modification of pre-existing ones. It is extremely important to comparative biology that related proteins be identified as members of the s...
Adriano Barbosa-Silva, Venkata P. Satagopam, Reinh...
The basis of much of the intelligence on the Web is the hyperlink structure which represents an organising principle based on the human facility to be able to discriminate between...