We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
Recently a lot of work on integrating the search interfaces of multiple Web databases of the same domain into an integrated interface has been reported. Such integrated interfaces ...
We consider the problem of partitioning, in a highly accurate and highly efficient way, a set of n documents lying in a metric space into k non-overlapping clusters. We augment th...
Filippo Geraci, Marco Pellegrini, Paolo Pisati, Fa...
Photo community sites such as Flickr and Picasa Web Album host a massive amount of personal photos with millions of new photos uploaded every month. These photos constitute an ove...
Liangliang Cao, Jie Yu, Jiebo Luo, Thomas S. Huang