CLaRK is an XML-based software system for corpora development. It incorporates several technologies: XML technology; Unicode; Regular Cascaded Grammars; Constraints over XML Docum...
Kiril Ivanov Simov, Alexander Simov, Milen Kouylek...
A large-scale controlled vocabulary indexing system is described. The system currently covers almost 70,000 named entity topics, and applies to documents from thousands of news pu...
Identifying and fixing defects is a crucial and expensive part of the software lifecycle. Measuring the quality of bug-fixing patches is a difficult task that affects both func...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
This project implements an integrated biological information website that classifies technical documents, learns about users' interests, and offers intuitive interactive visua...
Min Hong, Anis Karimpour-Fard, Steve Russell, Lawr...