Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...
The rapid growth of XML adoption has urged for the need of a proper representation for semi-structured documents, where the document structural information has to be taken into ac...
: Model-Driven Development concepts are exhibiting as a good engineering solution for the design of ubiquitous applications with multi-device user interfaces and other contextaware...
Marino Linaje Trigueros, Juan Carlos Preciado, Fer...
Decoding noisy document images is commonly needed in applications such as enterprise content management. Available OCR solutions are still not satisfactory especially on noisy ima...
In this paper we present a novel model based approach to detect severely broken parallel lines in noisy textual documents. It is important to detect and remove these lines so the ...