We present the RGAI systems which participated in the third Web People Search Task challenge. The chief characteristics of our approach are that we focus on the raw textual parts o...
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Current search mechanisms of DHT-based P2P systems can well handle a single keyword search problem. Other than single keyword search, multi-keyword search is quite popular and use...
Hanhua Chen, Hai Jin, Jiliang Wang, Lei Chen 0002,...
Current crawler-based search engines usually return a long list of search results containing a lot of noise documents. By indexing collected documents on topic path in taxonomy, t...
Current web search engines return result pages containing mostly text summary even though the matched web pages may contain informative pictures. A text excerpt (i.e. snippet) is ...