Sciweavers

» A Document as a Small World
ICML
2004
IEEE
16 years 7 months ago
A needle in a haystack: local one-class optimization
This paper addresses the problem of finding a small and coherent subset of points in a given data. This problem, sometimes referred to as one-class or set covering, requires to fi...
Koby Crammer, Gal Chechik
KDD
2004
ACM
16 years 6 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
ERCIMDL
2008
Springer
15 years 8 months ago
Revisiting Lexical Signatures to (Re-)Discover Web Pages
A lexical signature (LS) is a small set of terms derived from a document that capture the "aboutness" of that document. A LS generated from a web page can be used to disc...
Martin Klein, Michael L. Nelson
SEMWEB
2005
Springer
15 years 12 months ago
Rapid Benchmarking for Semantic Web Knowledge Base Systems
Abstract. We present a method for rapid development of benchmarks for Semantic Web knowledge base systems. At the core, we have a synthetic data generation approach for OWL that is...
Sui-Yu Wang, Yuanbo Guo, Abir Qasem, Jeff Heflin
ICIP
2004
IEEE
16 years 8 months ago
Efficient inscribing of noisy rectangular objects in scanned images
Objects identification in images is generally hard unless the objects are simple geometric shapes such as circles, rectangles or have very particular properties. Even simple geome...
Cormac Herley