The construction of a text classifier usually involves (i) a phase of term selection, in which the most relevant terms for the classification task are identified, (ii) a phase ...
Dissemination systems are used to route information received from many publishers individually to multiple subscribers. The core of a dissemination system consists of an efficient...
Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Coll...
K. Pramod Sankar, C. V. Jawahar, Raghavan Manmatha
Pseudo-relevance feedback is an effective technique for improving retrieval results. Traditional feedback algorithms use a whole feedback document as a unit to extract words for ...
The design and maintenance of an aero-engine generates a significant amount of documentation. When designing new engines, engineers must obtain knowledge gained from maintenance o...
Sylvia C. Wong, Richard M. Crowder, Gary B. Wills,...