Effectively summarizing Web page collections becomes more and more critical as the amount of information continues to grow on the World Wide Web. A concise and meaningful summary ...
Yongzheng Zhang, A. Nur Zincir-Heywood, Evangelos ...
Blogs are a new form of internet phenomenon and a vast, ever-increasing information resource. Mining blog files for information is a very new research direction in data mining. We p...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been widely exploited in many monolingual...
In this paper, we propose a novel contextual descriptor which combines the contextual information and local appearance. Based on Gibbs distribution, a local descriptor is designed...
Yi Ouyang, Ming Tang, Jian Cheng, Jinqiao Wang, Ha...
Retrieving relevant information is a crucial component of case-based reasoning systems for Internet applications such as search engines. The task is to use user-defined queries to...