Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingenc...
Inderjit S. Dhillon, Subramanyam Mallela, Dharmend...
Data integrated from multiple sources may contain inconsistencies that violate integrity constraints. The constraint repair problem attempts to find "low cost" changes t...
Philip Bohannon, Michael Flaster, Wenfei Fan, Raje...
Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's...
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally ? when keys are long or of variable length, ? when keys are...
Michael A. Bender, Martin Farach-Colton, Bradley C...
Navigational queries on Web-accessible life science sources pose unique query optimization challenges. The objects in these sources are interconnected to objects in other sources, ...
Jens Bleiholder, Samir Khuller, Felix Naumann, Lou...