We present the Lixto project, which is both a research project in database theory and a commercial enterprise that develops Web data extraction (wrapping) and Web service definiti...
Georg Gottlob, Christoph Koch, Robert Baumgartner,...
Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...
With over 800 million pages covering most areas of human endeavor, the World-wide Web is a fertile ground for data mining research to make a di erence to the e ectiveness of infor...
Web cache technologies have been developed as an extension of CPU cache, by modifying LRU (Least Recently Used) algorithms. Actually in web cache systems, we can use disks and ter...
As an alternative to previous studies on extracting class attributes from unstructured text, which consider either Web documents or query logs as the source of textual data, A boo...