The University of Maryland participated in three TREC-6 tasks: ad hoc retrieval, cross-language retrieval, and spoken document retrieval. The principal focus of the work was evalu...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
In this paper we compare performance of several heuristics in generating informative generic/query-oriented extracts for newspaper articles in order to learn how topic prominence ...
Mailing list archives (i.e., the compilation of the messages posted up-to-now) are often published on the web and indexed by conventional search engines.They store a vast knowledg...
Recent years have witnessed a dramatic increase in the quantity of image data collected, due to advances in fields such as medical imaging, reconnaissance, surveillance, astronomy...