Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domainspecific operation. We present our design of alias that us...
The PISAB Question Answering system is based on a combination of Information Extraction and Information Retrieval techniques. Knowledge extracted from documents is modeled as a se...
This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pil...
Jay Chen, Russell Power, Lakshminarayanan Subraman...
In order to minimize redundancy and optimize coverage of multiple user interests, search engines and recommender systems aim to diversify their set of results. To date, these dive...
Fusionplex is a system for integrating multiple heterogeneous and autonomous information sources that uses data fusion to resolve factual inconsistencies among the individual sour...