We present a method for very high-dimensional correlation analysis. The method relies equally on rigorous search strategies and on human interaction. At each step, the method cons...
Two central criteria for data quality are consistency and accuracy. Inconsistencies and errors in a database often emerge as violations of integrity constraints. Given a dirty dat...
In this paper, we consider the problem of keyword query cleaning for structured databases using a probabilistic approach. Keyword query cleaning consists of rewriting the user quer...
In response to the widespread use of the XML format for document representation and message exchange, major database vendors support XML in terms of persistence, querying and inde...
With the rise of applications that require content-based similarity retrieval, techniques to support such a retrieval paradigm over database systems have emerged as a critica...
Michael Ortega-Binderberger, Kaushik Chakrabarti, ...