Histograms have been widely used for fast estimation of query result sizes in query optimization. In this paper, we propose a new histogram method, called the Skew-Tolerant Histog...
Yohan J. Roh, Jae Ho Kim, Yon Dohn Chung, Jin Hyun...
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
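As a rough illustration of the map-function idea mentioned above, the following single-process Python sketch runs a word count in the MapReduce style. The function names (`map_words`, `reduce_counts`, `run_mapreduce`), the word-count task, and the in-memory grouping step are assumptions made for illustration; this is not the paper's distributed implementation.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Hypothetical map function: emit (word, 1) for each word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Hypothetical reduce function: merge all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory driver (an assumption, not a real framework API).

    1. Apply the mapper to every input key/value pair.
    2. Group the intermediate values by intermediate key.
    3. Apply the reducer once per intermediate key.
    """
    groups = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in mapper(key, value):
            groups[inter_key].append(inter_value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


# Example input: (document id, document text) pairs.
docs = [("d1", "the quick fox"), ("d2", "the lazy dog the end")]
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts)
```

In this sketch the user supplies only the two small functions, while the driver handles grouping. In a real system the driver would also partition the work across machines, which is omitted here.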
Space constrained optimization problems arise in a multitude of important applications such as data warehouses and pervasive computing. A typical instance of such problems is to s...
Themistoklis Palpanas, Nick Koudas, Alberto O. Men...
This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structur...
Microarray technology is a powerful tool for geneticists to monitor interactions among tens of thousands of genes simultaneously. There has been extensive research on coherent sub...