Data cleaning and ETL processes are usually modeled as graphs of data transformations. The involvement of the users responsible for executing these graphs over real data is importa...
Search engines are the primary gateways of information access on the Web today. Behind the scenes, search engines crawl the Web to populate a local indexed repository of Web pages...
The Internet and file-sharing technology (such as P2P networks) significantly reduce the cost of content distribution. However, better digital content distribution also means that ...
Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many areas, complex dista...
Stefan Brecheisen, Hans-Peter Kriegel, Martin Pfei...
Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data that is embedded in natural-language text. In particular...