This paper presents a semantic confidence measure that aims to predict the relevance of automatic transcripts for Spoken Document Retrieval (SDR) tasks. The proposed predicti...
In many topic identification applications, supervised training labels are indirectly related to the semantic content of the documents being classified. For example, many topical...
Biomedical literature is an important source of information for chemical compounds. However, different representations and nomenclatures for chemical entities exist, which makes th...
Tiago Grego, Piotr Pezik, Francisco M. Couto, Diet...
We present a generic approach to readable formal proof documents, called Intelligible semi-automated reasoning (Isar). It addresses the major problem of existing interact...
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...