This paper addresses the challenging problem of similarity search over widely distributed ultra-high dimensional data. Such an application is retrieval of the top-k most similar d...
In this paper we present a new document representation model based on implicit user feedback obtained from search engine queries. The main objective of this model is to achieve be...
Abstract. Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Spatio-temporal, geo-referenced datasets are growing rapidly, and will be more in the near future, due to both technological and social/commercial reasons. From the data mining vie...
Distributed hash tables (DHTs) are very efficient for querying based on key lookups. However, building huge term indexes, as required for IR-style keyword search, poses a scalabil...
Odysseas Papapetrou, Wolf Siberski, Wolfgang Nejdl