In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering&...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
In this paper, we propose the "Democratic Classifier", a simple, democracy-inspired patternbased classification algorithm that uses very short patterns for classificatio...
We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with bandwidth consumption that has been ide...
Gleb Skobeltsyn, Toan Luu, Ivana Podnar Zarko, Mar...
—The identification of a person on the basis of scanned images of handwriting is a useful biometric modality with application in forensic and historic document analysis and const...
Abstract—In this paper, we present a novel languageindependent algorithm for extracting text-lines from handwritten document images. Our algorithm is based on the seam carving ap...