We describe an HTML web page segmentation algorithm and apply it to online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
The wikipediaMM task provides a testbed for the system-oriented evaluation of ad-hoc retrieval from a collection of Wikipedia images. It became a part of the ImageCLEF evaluation ...
We motivate and develop a natural bicriteria measure for assessing the quality of a clustering that avoids the drawbacks of existing measures. A simple recursive heuristic is shown...
In this paper we briefly describe the approaches taken by Berkeley for the main GeoCLEF 2007 tasks (Mono and Bilingual retrieval). This year we used only a single system in ...
Swoogle is a crawler-based indexing and retrieval system for Semantic Web documents – i.e., RDF or OWL documents. It analyzes the documents it discovers to compute useful m...
Li Ding, Timothy W. Finin, Anupam Joshi, Rong Pan,...