Batched stream processing is a new distributed data processing paradigm that models recurring batch computations on incrementally bulk-appended data streams. The model is inspired...
Bingsheng He, Mao Yang, Zhenyu Guo, Rishan Chen, B...
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...
In the database community, work on information extraction (IE) has centered on two themes: how to effectively manage IE tasks, and how to manage the uncertainties that arise in th...
Daisy Zhe Wang, Michael J. Franklin, Minos N. Garo...
In this paper we investigate how to scale a content based image retrieval approach beyond the RAM limits of a single computer and to make use of its hard drive to store the featur...
Large amounts of (often valuable) information are stored in web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at...
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Jungh...