Sciweavers

» Similarity Patterns in Language
WWW
2006
ACM
16 years 7 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
WWW
2005
ACM
16 years 7 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
KDD
2005
ACM
16 years 7 months ago
A general model for clustering binary data
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Tao Li
DCC
2008
IEEE
16 years 6 months ago
Geometric Burrows-Wheeler Transform: Linking Range Searching and Text Indexing
We introduce a new variant of the popular Burrows-Wheeler transform (BWT) called Geometric Burrows-Wheeler Transform (GBWT). Unlike BWT, which merely permutes the text, GBWT conve...
Yu-Feng Chien, Wing-Kai Hon, Rahul Shah, Jeffrey S...
KDD
2009
ACM
16 years 1 months ago
Visual exploration of categorical and mixed data sets
For categorical data there does not exist any similarity measure which is as straightforward and general as the numerical distance between numerical items. Due to this it is ofte...
Sara Johansson