This paper considers the problem of identifying compound documents (cDocs) on the Web: groups of web pages that in aggregate constitute semantically coherent information entities...
Paraphrases have proved to be useful in many applications, including Machine Translation, Question Answering, Summarization, and Information Retrieval. Paraphrase acquisition meth...
The problem of hypertext classification deals with objects that have a more complex information structure than plain text. Present hypertext classification systems show the...
An indexing model is the heart of an Information Retrieval (IR) system. Data structures such as term-based inverted indices have proved to be very effective for IR using vector sp...
Mining Chinese term translations from abundant Web resources can be applied in many fields, such as reading/writing assistance, machine translation and cross-language information r...