Abstract. Topic models are a discrete analogue to principle component analysis and independent component analysis that model topic at the word level within a document. They have ma...
A vector model based information retrieval of handwritten medical forms is presented in this paper. In order to improve the IR performance on the erroneous output of handwriting r...
This paper shows an approach for converting bitmap images of text glyphs into a vector format which is suitable for being embedded in XML representations of digitized documents. T...
Stefan Pletschacher, Marcel Eckert, Arved C. H&uum...
Information retrieval (IR) is an effective mechanism for text management that has received widespread adoption in the world at large. But it is not a particularly creative mechanis...
A perturbation model for generating synthetic textlines from existing cursively handwritten lines of text produced by human writers is presented. Our purpose is to improve the per...