Sciweavers

Search results: Higher order mining
KDD
2008
ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2006
ACM
GPLAG: detection of software plagiarism by program dependence graph analysis
Along with the blossom of open source projects comes the convenience for software plagiarism. A company, if less self-disciplined, may be tempted to plagiarize some open source pr...
Chao Liu 0001, Chen Chen, Jiawei Han, Philip S. Yu
KDD
2009
ACM
Visual analysis of documents with semantic graphs
In this paper, we present a technique for visual analysis of documents based on the semantic representation of text in the form of a directed graph, referred to as semantic graph....
Delia Rusu, Blaz Fortuna, Dunja Mladenic, Marko Gr...
CIKM
2009
Springer
Potential collaboration discovery using document clustering and community structure detection
Complex network analysis is a growing research area in a wide variety of domains and has recently become closely associated with data, text and web mining. One of the most active ...
Cristian Klen dos Santos, Alexandre Evsukoff, Beat...
ICDE
2006
IEEE
A New Approach for Reactive Web Usage Data Processing
Web usage mining exploits data mining techniques to discover valuable information from navigation behavior of World Wide Web (WWW) users. The required information is captured b...
Murat Ali Bayir, Ismail Hakki Toroslu, Ahmet Cosar