Online aggregation is a promising way to achieve fast early responses for interactive ad-hoc queries that compute aggregates over large amounts of data. Essential to the suc...
Information extraction (IE) — the problem of extracting structured information from unstructured text — has become an increasingly important topic in recent years. A SIGMOD 20...
Laura Chiticariu, Yunyao Li, Sriram Raghavan, Fred...
To take full advantage of the parallelism offered by a multicore machine, one must write parallel code. Writing parallel code is difficult. Even when one writes correct code, the...
John Cieslewicz, Kenneth A. Ross, Kyoho Satsumi, Y...
Edit-distance-based string similarity join is a fundamental operator in string databases. Increasingly, many applications in data cleaning, data integration, and scientific compu...
Similarity search methods are widely used as kernels in various data mining and machine learning applications, including those in computational biology and web search/clustering. Near...