Abstract. In this paper we are dealing with the task of adding domainspecific semantic tags to a document, based solely on the domain ontology and generic lexical and Web resource...
Elias Zavitsanos, George Tsatsaronis, Iraklis Varl...
Postcorrection of OCR-results for text documents is usually based on electronic dictionaries. When scanning texts from a specific thematic area, conventional dictionaries often m...
Christian M. Strohmaier, Christoph Ringlstetter, K...
This paper presents MetaNews, an information gathering agent for news articles on the Web. MetaNews reads HTML documents from online news sites and extracts article information fro...
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic ...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA), to deal with textual errors in the document collection. Our model is inspired by...