Where Information Retrieval (IR) and Text Categorization delivers a set of (ranked) documents according to a query, users of large document collections would rather like to receiv...
In many fields where human understanding plays a crucial role, such as bioprocesses, the capacity of extracting knowledge from data is of critical importance. Within this framewor...
Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and imp...
Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
In this paper we present GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...