This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
We address the problem that current Web applications present mainly the content-centric information, but lack cues and browsing mechanisms for online social information. After summ...
Recent work on ontology-based Information Extraction (IE) has tried to make use of knowledge from the target ontology in order to improve semantic annotation results. However, ver...
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
This research explores the interaction of textual and photographic information in document understanding. The problem of performing generalpurpose vision without apriori knowledge...