Many of the Japanese ideographs (Chinese characters) have a few meanings. Such ambiguities should be identified by using their contextual information. For example, we have an ideo...
This paper presents Japanese morphological analysis based on conditional random fields (CRFs). Previous work in CRFs assumed that observation sequence (word) boundaries were fixed...
We present an approximation to the Bayesian hierarchical PitmanYor process language model which maintains the power law distribution over word tokens, while not requiring a comput...
This paper proposes a Japanese/English crosslanguage information retrieval (CLIR) system targeting technical documents. Our system first translates a given query containing techni...
Translating between dissimilar languages requires an account of the use of divergent word orders when expressing the same semantic content. Reordering poses a serious problem for s...