Sciweavers

703 search results - page 78 / 141
» Efficient anonymity-preserving data collection
ICDE
2009
IEEE
16 years 8 months ago
Large-Scale Deduplication with Constraints Using Dedupalog
We present a declarative framework for collective deduplication of entity references in the presence of constraints. Constraints occur naturally in many data cleaning domains and c...
Arvind Arasu, Christopher Ré, Dan Suciu
CIKM
2000
Springer
15 years 10 months ago
Scalable association-based text classification
Naïve Bayes (NB) classifier has long been considered a core methodology in text classification mainly due to its simplicity and computational efficiency. There is an increasing n...
Dimitris Meretakis, Dimitris Fragoudis, Hongjun Lu...
SC
2005
ACM
15 years 11 months ago
Optimized Data Loading for a Multi-Terabyte Sky Survey Repository
Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers...
Y. Dora Cai, Ruth A. Aydt, Robert Brunner
KDD
2006
ACM
16 years 6 months ago
Simultaneous record detection and attribute labeling in web data extraction
Recent work has shown the feasibility and promise of template-independent Web data extraction. However, existing approaches use decoupled strategies, attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
KDD
2002
ACM
16 years 6 months ago
ADMIT: anomaly-based data mining for intrusions
Security of computer systems is essential to their acceptance and utility. Computer security analysts use intrusion detection systems to assist them in maintaining computer system...
Karlton Sequeira, Mohammed Javeed Zaki