The popularity of batch-oriented cluster architectures like Hadoop is on the rise. These batch-based systems successfully achieve high degrees of scalability by carefully allocati...
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
We introduce XPORT, a profile-driven distributed data dissemination system that supports an extensible set of data types, profile types, and optimization metrics. XPORT efficientl...
Clio, the IBM Research system for expressing declarative schema mappings, has progressed in the past few years from a research prototype into a technology that is behind some of I...
Schema mapping algorithms rely on value correspondences ? i.e., correspondences among semantically related attributes ? to produce complex transformations among data sources. Thes...