Sciweavers

4560 search results - page 270 / 912
» Finding Data in the Neighborhood
KDD
2005
ACM
139 views, Data Mining
16 years 7 months ago
Reasoning about sets using redescription mining
Redescription mining is a newly introduced data mining problem that seeks to find subsets of data that afford multiple definitions. It can be viewed as a generalization of associa...
Mohammed Javeed Zaki, Naren Ramakrishnan
KDD
2003
ACM
156 views, Data Mining
16 years 7 months ago
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...
Stephen D. Bay, Mark Schwabacher
VLDB
2005
ACM
153 views, Database
16 years 6 months ago
An effective and efficient algorithm for high-dimensional outlier detection
The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are...
Charu C. Aggarwal, Philip S. Yu
AAAI
2007
15 years 9 months ago
The Impact of Time on the Accuracy of Sentiment Classifiers Created from a Web Log Corpus
We investigate the impact of time on the predictability of sentiment classification research for models created from web logs. We show that sentiment classifiers are time dependen...
Kathleen T. Durant, Michael D. Smith
PVLDB
2010
195 views
15 years 1 month ago
Trie-Join: Efficient Trie-based String Similarity Joins with Edit-Distance Constraints
A string similarity join finds similar pairs between two collections of strings. It is an essential operation in many applications, such as data integration and cleaning, and has ...
Jiannan Wang, Guoliang Li, Jianhua Feng