In this paper, we have considered a real world information synthesis task, generation of a fixed length multi document summary which satisfies a specific information need. This...
—We present an OCR-driven writer identification algorithm in this paper. Our algorithm learns writer-specific characteristics more precisely from explicit character alignment usi...
With the rapid expansion and utilization of the Internet and Web technologies, there is an increasing number of on-line medical journals. On-line journals pose new challenges in t...
Daniel X. Le, Loc Q. Tran, Joseph Chow, Jongwoo Ki...
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
Users need more sophisticated tools to handle the growing number of image-based documents available in databases. In this paper, we present a system devoted to the editing and bro...