Random sampling is a well-known technique for approximate processing of large datasets. We introduce a set of algorithms for incremental maintenance of large random samples on seco...
K-Means clustering is widely used in information retrieval and data mining. Distributed K-Means variants have already been proposed, but none of the past algorithms scales to large...
Odysseas Papapetrou, Wolf Siberski, Fabian Leitrit...
There is an extensive literature on data mining techniques, including several applications of these techniques in the e-commerce setting. However, all previous approaches require t...
The continuous growth of interest in mobile applications makes the concept of location essential to the design and development of software systems. Location-based software is supposed to be a...
Finding RDF individuals that refer to the same real-world entities but have different URIs is necessary for the efficient use of data across sources. The requirements for such inst...
Andriy Nikolov, Victoria S. Uren, Enrico Motta, An...