We present a framework to extract the most important features (tree fragments) from a Tree Kernel (TK) space according to their importance in the target kernel-based machine, e.g. ...
Image spam is a new obfuscation method that spammers invented to bypass conventional text-based spam filters more effectively. In this paper, a framework for filtering image spam...
Separating Chinese characters from English characters is helpful for OCR. In this paper, a multi-level cascade classifier combined with feature selection is construc...
Yuanping Zhu, Jun Sun 0004, Akihiro Minagawa, Yosh...
This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We have conducted serie...
An OCR-free word spotting method is developed and evaluated under a strong experimental protocol. Different feature sets are evaluated under the same experimental conditions. In ...
Israel Rios, Alceu de Souza Britto Jr., Alessandro...