Sciweavers

6079 search results - page 969 / 1216
» Aspect-Oriented Process Engineering
KDD
2008
ACM
183 views, Data Mining
16 years 7 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2006
ACM
198 views, Data Mining
16 years 7 months ago
Event detection from evolution of click-through data
Previous efforts on event detection from the web have focused primarily on web content and structure data ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
KDD
2001
ACM
253 views, Data Mining
16 years 6 months ago
GESS: a scalable similarity-join algorithm for mining large data sets in high dimensional spaces
The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) that are within a dis...
Jens-Peter Dittrich, Bernhard Seeger
POPL
2008
ACM
16 years 6 months ago
From dirt to shovels: fully automatic tool generation from ad hoc data
An ad hoc data source is any semistructured data source for which useful data analysis and transformation tools are not readily available. Such data must be queried, transformed a...
Kathleen Fisher, David Walker, Kenny Qili Zhu, Pet...
SIGMOD
2008
ACM
145 views, Database
16 years 6 months ago
The Claremont report on database research
In late May, 2008, a group of database researchers, architects, users and pundits met at the Claremont Resort in Berkeley, California to discuss the state of the research field an...
Rakesh Agrawal, Anastasia Ailamaki, Philip A. Bern...