The World Wide Web (WWW) is a vast source of information, and the problem of information overload is more acute than ever. Due to noise on the WWW, it is becoming hard to find usable informati...
In this paper, we use the structural and relational information on the Web to find entity-pages. Specifically, given a Web site and an entity-page (e.g., department and faculty ...
Tim Weninger, Fabio Fumarola, Cindy Xide Lin, Rick...
Due to the rapid acceptance and fast spread of web services, a number of mission-critical systems will be deployed as web services in the coming years. The availability of those ...
Jorge Salas, Francisco Perez-Sorrosal, Marta Pati...
Web applications facilitated by technologies such as JavaScript, DHTML, AJAX, and Flash use a considerable amount of dynamic web content that is either inaccessible or unusable by...
Yevgen Borodin, Jeffrey P. Bigham, Rohit Raman, I....
We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...