Sciweavers

2277 search results - page 278 / 456
» Clustering by pattern similarity in large data sets
WWW
2006
ACM
16 years 7 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
FOCS
2009
IEEE
15 years 10 months ago
Space-Efficient Framework for Top-k String Retrieval Problems
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...
Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter
KDD
2002
ACM
16 years 7 months ago
Topics in 0-1 data
Large 0-1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which in...
Ella Bingham, Heikki Mannila, Jouni K. Seppän...
IEAAIE
2009
Springer
16 years 1 month ago
An Efficient Algorithm for Maintaining Frequent Closed Itemsets over Data Stream
Data mining refers to the process of revealing unknown and potentially useful information from a large database. Frequent itemsets mining is one of the foundational problems in dat...
Show-Jane Yen, Yue-Shi Lee, Cheng-Wei Wu, Chin-Lin...
KDD
2004
ACM
16 years 7 months ago
Web usage mining based on probabilistic latent semantic analysis
The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discoveri...
Xin Jin, Yanzan Zhou, Bamshad Mobasher