Named Entity Recognition (NER) is an important subtask of document processing tasks such as Information Extraction. This paper describes an NER algorithm that uses a Multi-Layer Percept...
We participated in the document search and expert search tasks of the Enterprise Track at TREC 2008. The corpus and tasks are the same as the year before. Unlike TREC 2007, the topics come fro...
Yufei Xue, Tong Zhu, Guichun Hua, Min Zhang, Yiqun...
In this paper, we report our experiments on the HARD (High Accuracy Retrieval from Documents) Track in TREC 2003. We focus on active feedback, i.e., how to intelligently propose q...
This paper introduces a procedure based on genetic programming to evolve XSLT programs (usually called stylesheets or logicsheets). XSLT is a general-purpose, document-oriented fu...
This paper describes embedding a mathematical formula recognition module into the OCR system OCRopus, with the aim of developing an OCR system for scientific and technical documents wh...