The problem of document categorization is considered. The set of domains and the keywords specific for these domains is supposed to be selected beforehand as initial data. We apply...
Mikhail Alexandrov, Alexander F. Gelbukh, George L...
: Up to now, the results of applying sophisticated NL techniques to IR have been mostly disappointing. Our research aims at investigating in detail the role of syntactic analysis i...
The Grammatical Knowledge base of Contemporary Chinese contains detailed feature descriptions of the morphological and syntactic behavior of a more than fifty thousand Chinese wor...
: This paper proposes a new inference approach for Chinese probabilistic context-free grammar, which implements the EM algorithm based on the bracket matching schemes. By utilizing...
Abstract. In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems foun...
Jan Strunk, Carlos Nascimento Silla Jr., Celso A. ...