Estimating the cardinality (i.e., the number of distinct elements) of an arbitrary set expression defined over multiple distributed streams is one of the most fundamental queries of in...
In this paper we argue that developing information extraction (IE) programs using Datalog with embedded procedural extraction predicates is a good way to proceed. First, compared ...
Warren Shen, AnHai Doan, Jeffrey F. Naughton, Ragh...
Sharing structured data today requires standardizing upon a single schema, then mapping and cleaning all of the data. This results in a single queriable mediated data instance. Ho...
Zachary G. Ives, Todd J. Green, Grigoris Karvounar...
Content-based dissemination of XML data using the publish-subscribe paradigm is an effective means to deliver relevant data to interested data consumers. To meet the performance ch...
Translating data and data access operations between applications and databases is a longstanding data management problem. We present a novel approach to this problem, in which the...