Content Management Systems (CMS) store enterprise data such as insurance claims, insurance policies, legal documents, patent applications, or archival data, as in the case of dig...
Fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
This paper describes the 3Book, a 3D interactive visualization of a codex book as a component for various digital library and sensemaking systems. The book is designed to hold lar...
Stuart K. Card, Lichan Hong, Jock D. Mackinlay, Ed...
We present our hybrid system for the PAN challenge at CLEF 2010. Our system performs plagiarism detection for translated and non-translated externally as well as intrinsically plag...
Markus Muhr, Roman Kern, Mario Zechner, Michael Gr...
Skew estimation and page segmentation are two closely related processing stages in document image analysis. Skew estimation needs proper page segmentation, especially for doc...