The aim of this paper is to evaluate the potential usefulness of the reject option for text categorisation (TC) tasks. The reject option is a technique used in statistical pattern...
We suggest a novel approach for compressing images of text documents based on building up a simple derived font from patterns in the image, and present the results of a prototype ...
We built a system for the automatic creation of a text-based topic hierarchy, meant to be used in a geographically defined community. This poses two main problems. First, the appea...
The traditional bag-of-words model and the more recent word-sequence kernel are two well-known techniques in the field of text categorization. The bag-of-words representation neglects the word orde...
Lei Zhang, Debbie Zhang, Simeon J. Simoff, John K....
We describe a methodology for combining two different techniques for Semantic Relation Extraction from texts. On the one hand, generic lexico-syntactic patterns are applied to the...