Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large databases. Quasi-experimental designs (QEDs) exploi...
David D. Jensen, Andrew S. Fast, Brian J. Taylor, ...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Many outlier detection methods do not merely provide the decision for a single data object being or not being an outlier but give also an outlier score or “outlier factor” sig...
There has been considerable past work studying data integration and uncertain data in isolation. We develop the foundations for local-as-view (LAV) data integration when the sourc...
Parag Agrawal, Anish Das Sarma, Jeffrey D. Ullman,...
Background: Flow Cytometry is a process by which cells, and other microscopic particles, can be identified, counted, and sorted mechanically through the use of hydrodynamic pressu...
Shareef Dabdoub, William C. Ray, Sheryl S. Justice