Sciweavers

» A Data Quality Browser
SIGMOD
2008
ACM
Database
DiMaC: a system for cleaning disguised missing data
In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Ming Hua, Jian Pei
SIGMOD
2010
ACM
Database
Leveraging spatio-temporal redundancy for RFID data cleansing
Radio Frequency Identification (RFID) technologies are used in many applications for data collection. However, raw RFID readings are usually of low quality and may contain many an...
Haiquan Chen, Wei-Shinn Ku, Haixun Wang, Min-Te Su...
SP
2008
IEEE
Security Privacy
Casting out Demons: Sanitizing Training Data for Anomaly Sensors
The efficacy of Anomaly Detection (AD) sensors depends heavily on the quality of the data used to train them. Artificial or contrived training data may not provide a realistic v...
Gabriela F. Cretu, Angelos Stavrou, Michael E. Loc...
WEBDB
2009
Springer
Database
Functional Dependency Generation and Applications in Pay-As-You-Go Data Integration Systems
Recently, the opportunity of extracting structured data from the Web has been identified by a number of research projects. One such example is that millions of relational-style H...
Daisy Zhe Wang, Xin Luna Dong, Anish Das Sarma, Mi...
ICDM
2002
IEEE
Data Mining
Using Category-Based Adherence to Cluster Market-Basket Data
In this paper, we devise an efficient algorithm for clustering market-basket data. Different from those of the traditional data, the features of market-basket data are known to b...
Ching-Huang Yun, Kun-Ta Chuang, Ming-Syan Chen