Sixearch.org is a peer application for social, distributed, adaptive Web search, which integrates the Sixearch.org protocol, a topical crawler, a document indexing system, a retri...
Finding information is a problem shared by people and intelligent systems. This paper describes an experiment combining both human and machine aspects in a knowledge-based system t...
Despite the widespread use of BM25, there have been few studies examining its effectiveness on a document description over single and multiple field combinations. We determine t...
Tables are ubiquitous in web pages and scientific documents. With the explosive development of the web, tables have become a valuable information repository. Therefore, effective...
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near-duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu ...