A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to do this is to build specialized applicatio...
The transition from command-line interfaces to graphical interfaces has resulted in programs that are easier to learn and use, but harder to automate and reuse. Another transition...
Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies--attempting to do data record detection and attribute labeling in two se...
The Semantic Web promises automated invocation, discovery, and composition of Web services by enhancing services with semantic descriptions. An upper ontology for Web services cal...
Gerald C. Gannod, John T. E. Timm, Raynette J. Bro...