This paper presents two methods which automatically produce annotated corpora for text summarisation on the basis of human abstracts. Both methods identify a set of sentences from ...
Abstract. A speech act is a linguistic action intended by a speaker. It is important to analyze the speech act for the dialogue understanding system because the speech act of an ut...
Abstract. This paper proposes an approach to improve statistical word alignment with ensemble methods. Two ensemble methods are investigated: bagging and cross-validation committee...
Abstract. This paper presents our recent work on period disambiguation, the kernel problem in sentence boundary identification, with the maximum entropy (Maxent) model. A number o...
We present a preliminary study of several parser adaptation techniques evaluated on the GENIA corpus of MEDLINE abstracts [1,2]. We begin by observing that the Penn Treebank (PTB) is lexic...