Sciweavers

» A Pattern Based Data Mining Approach
KDD
2004
ACM
Data Mining
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
ICDM
2008
IEEE
Data Mining
xCrawl: A High-Recall Crawling Method for Web Mining
Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...
ICDM
2003
IEEE
Data Mining
Comparing Pure Parallel Ensemble Creation Techniques Against Bagging
We experimentally evaluate randomization-based approaches to creating an ensemble of decision-tree classifiers. Unlike methods related to boosting, all of the eight approaches co...
Lawrence O. Hall, Kevin W. Bowyer, Robert E. Banfi...
CIVR
2009
Springer
Image Analysis
Mining from Large Image Sets
So far, most image mining was based on interactive querying. Although such querying will remain important in the future, several applications need image mining at such wide scale...
Luc J. Van Gool, Michael D. Breitenstein, Stephan ...
SC
2009
ACM
Web 2.0-based social informatics data grid
The Social Informatics Data Grid (SIDGrid) is a new cyberinfrastructure designed to transform how social and behavioral scientists collect and annotate data, collaborate and share...
Wenjun Wu, Thomas D. Uram, Michael E. Papka