Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies, attempting to do data record detection and attribute labeling in two se...
We address the issue of classifying complex data. We focus on three main sources of complexity, namely, the high dimensionality of the observed data, the dependencies between these...
We present a new approach for personalizing Web search results to a specific user. Ranking functions for Web search engines are typically trained by machine learning algorithms u...
David Sontag, Kevyn Collins-Thompson, Paul N. Benn...
This paper focuses on the problem of improving the distributed query throughput of an RDBMS-based data integration system that has to inherit the query execution model of the underly...