The general-purpose processor has long been the focus of intense optimization efforts that have resulted in an impressive doubling of performance every 18 months. However, recent ...
Christopher T. Weaver, Rajeev Krishna, Lisa Wu, To...
We investigate the problem of learning document classifiers in a multilingual setting, from collections where labels are only partially available. We address this problem in the ...
Entity information management (EIM) is a nascent IR research area that investigates the information management process about entities instead of documents. It is motivated by the ...
A novel text extraction method from graphical document images is presented in this paper. Graphical document images containing text and graphics components are considered as two-d...
Abstract. Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication or I/O between ...