We describe a system for extracting mentions of terms such as company and product names, in a large and noisy corpus of documents, such as the World Wide Web. Since natural langua...
Einat Amitay, Rani Nelken, Wayne Niblack, Ron Siva...
There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as W...
Searching digital biomedical images is a challenging problem. Prevalent retrieval techniques involve human-supplied text annotations to describe image contents. Biomedical images,...
Annotated speech corpora are databases consisting of signal data along with time-aligned symbolic ‘transcriptions’. Such databases are typically multidimensional, heterogeneou...
The challenge of automatically summarising Web pages and sites is a great one. However, currently there is no solution which offers an easy way to produce unbiased, coherent , and...