This paper addresses the problem of finding a small and coherent subset of points in a given data. This problem, sometimes referred to as one-class or set covering, requires to fi...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
A lexical signature (LS) is a small set of terms derived from a document that capture the "aboutness" of that document. A LS generated from a web page can be used to disc...
Abstract. We present a method for rapid development of benchmarks for Semantic Web knowledge base systems. At the core, we have a synthetic data generation approach for OWL that is...
Objects identification in images is generally hard unless the objects are simple geometric shapes such as circles, rectangles or have very particular properties. Even simple geome...