Data lineage and data provenance are key to the management of scientific data. Not knowing the exact provenance and processing pipeline used to produce a derived data set often re...
With the explosion of social media, scalability becomes a key challenge. There are two main aspects of the problems that arise: 1) data volume: how to manage and analyze huge data...
Ching-Yung Lin, Jimeng Sun, Nan Cao, Shixia Liu, S...
This paper studies the problem of learning a full range of pairwise affinities gained by integrating local grouping cues for spectral segmentation. The overall quality of the spect...
Tae Hoon Kim (Seoul National University), Kyoung M...
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgra...