Books and magazines often contain pages with audacious mixtures of color images and text. Our problem consists in coding the background colors of such documents without wa...
Semantics can be integrated into search processing during both document analysis and querying stages. We describe a system that incorporates both, semantic annotations of Wikipedi...
We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...
We investigate how the normalization of vectors influences the result of SVMs. 1 Normalization For the theoretical background, please refer to [1]. 2 Experiments We empirically co...
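As a minimal illustration of what normalizing a vector before SVM training can look like, the sketch below rescales a feature vector to unit Euclidean length. The abstract does not say which normalization is studied, so the choice of the L2 norm and the helper name `l2_normalize` are assumptions made for this example only.

```python
import math


def l2_normalize(vector):
    """Return a copy of `vector` scaled to unit Euclidean (L2) length.

    Assumption: L2 normalization is used purely for illustration; the
    source text does not state which normalization it evaluates.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction, so return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


if __name__ == "__main__":
    print(l2_normalize([3.0, 4.0]))
```

After this step every non-zero input vector has the same length, so only its direction can influence a kernel evaluation.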
In this paper, we describe the algorithm used to carry out our plagiarism detection in the context of the PAN10 competition. Our system is based on the Lempel-Ziv dist...
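To give a feel for compression-based text similarity, here is a rough sketch of a related measure, the normalized compression distance, computed with Python's `zlib` (whose DEFLATE format builds on LZ77). This is a swapped-in stand-in, not the system's actual Lempel-Ziv distance, whose definition is cut off in the text above; the function names `compressed_size` and `ncd` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib (DEFLATE, LZ77-based) compression."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Assumption: this is an illustrative stand-in for a Lempel-Ziv style
    distance, not the measure used by the system described above.
    Values near 0 suggest heavy overlap; values near 1 suggest little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog " * 20
    other = "lorem ipsum dolor sit amet consectetur adipiscing elit " * 20
    print(ncd(source, source), ncd(source, other))
```

A passage copied from a source compresses well when concatenated with it, so a low distance between a suspicious passage and a candidate source can flag the pair for closer inspection.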