Sciweavers

1815 search results for "From Web Pages to Web Communities" (page 112 of 363)
BNCOD
2006
88 views, Database
15 years 7 months ago
The Lixto Project: Exploring New Frontiers of Web Data Extraction
The Lixto project is an ongoing research effort in the area of Web data extraction. Whereas the project originally started out with the idea to develop a logic-based extraction lan...
Julien Carme, Michal Ceresna, Oliver Frölich,...
WWW
2006
ACM
16 years 10 days ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
WWW
2008
ACM
16 years 7 months ago
Mining the search trails of surfing crowds: identifying relevant websites from user activity
The paper proposes identifying relevant information sources from the history of combined searching and browsing behavior of many Web users. While it has been previously shown that...
Mikhail Bilenko, Ryen W. White
COMPSAC
2003
IEEE
15 years 11 months ago
A Supervised Visual Wrapper Generator for Web-Data Extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a novel sch...
Xiaofeng Meng, Haiyan Wang, Dongdong Hu, Chen Li
AWIC
2003
Springer
15 years 11 months ago
Formalization of Web Design Patterns Using Ontologies
Design patterns have been enthusiastically embraced in the software engineering community as well as in the web community since they capture knowledge about how and when to apply a...
Susana Montero, Paloma Díaz, Ignacio Aedo