A multilevel semantic document classification system has been developed that combines a Support Vector Machine (SVM) with domain ontologies. The documents related to the s...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
We propose GRAFIT (GuaRAnteed FIT), a new, very low-complexity, single-pass algorithm for compression of continuous-tone compound documents that can guarantee a minimum ...
Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. There are current...
In searching a repository of business documents, a task of interest is using a query signature image to retrieve other signatures matching the query from a database. The ...
Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Ha...