In many data sharing settings, such as within the biological and biomedical communities, global data consistency is not always attainable: different sites' data may be dirty,...
The generalized hypertree width GHW(H) of a hypergraph H is a measure of its cyclicity. Classes of conjunctive queries or constraint satisfaction problems whose associated hypergr...
Recent years have seen growing interest in effective algorithms for summarizing and querying massive, high-speed data streams. Randomized sketch synopses provide accurate approxima...
Graham Cormode, Minos N. Garofalakis, Dimitris Sac...
In situations where class labels are known for part of the objects, a cluster analysis respecting this information, i.e., semi-supervised clustering, can give insight into the cl...
Micro-data protection is a hot topic in the field of Statistical Disclosure Control (SDC), which has gained special interest after the disclosure of 658,000 queries by the AOL searc...