Set similarity join has played an important role in many real-world applications such as data cleaning, near duplication detection, data integration, and so on. In these applicati...
Large-scale data analysis lies in the core of modern enterprises and scientific research. With the emergence of cloud computing, the use of an analytical query processing infrast...
Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra...
Optimization of join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each ...
Kostas Tzoumas, Amol Deshpande, Christian S. Jense...
Complex networks, such as biological, social, and communication networks, often entail uncertainty, and thus, can be modeled as probabilistic graphs. Similar to the problem of sim...
Michalis Potamias, Francesco Bonchi, Aristides Gio...
In this paper, we describe a main memory hybrid database system called HYRISE, which automatically partitions tables into vertical partitions of varying widths depending on how th...