The paper presents Bulgarian National Corpus project (BulNC) - a large-scale, representative, online available corpus of Bulgarian. The BulNC is also a monolingual general corpus,...
We have implemented an incremental lexical acquisition mechanism that learns the meanings of previously unknown words from the context in which they appear, as a part of the proce...
In Japanese, it is quite common for the same word to be written in several different ways. This is especially true for katakana words which are typically used for transliterating ...
Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experime...
Unlike traditional database queries, keyword queries do not adhere to predefined syntax and are often dirty with irrelevant words from natural languages. This makes accurate and e...