Sciweavers

» Improving Performance on the Internet
WWW
2008
ACM
16 years 7 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
16 years 7 months ago
Hierarchical, perceptron-like learning for ontology-based information extraction
Recent work on ontology-based Information Extraction (IE) has tried to make use of knowledge from the target ontology in order to improve semantic annotation results. However, ver...
Yaoyong Li, Kalina Bontcheva
WWW
2007
ACM
16 years 7 months ago
Search engine retrieval of changing information
In this paper, we analyze the Web coverage of three search engines: Google, Yahoo and MSN. We conducted a 15-month study collecting 15,770 Web content or information pages linked f...
Yang Sok Kim, Byeong Ho Kang, Paul Compton, Hirosh...
WWW
2007
ACM
16 years 7 months ago
Automatic searching of tables in digital libraries
Tables are ubiquitous. Unfortunately, no search engine supports table search. In this paper, we propose a novel table-specific search engine, TableSeer, to facilitate the table...
Ying Liu, Kun Bai, Prasenjit Mitra, C. Lee Giles
WWW
2007
ACM
16 years 7 months ago
Multiway SLCA-based keyword search in XML data
Keyword search for smallest lowest common ancestors (SLCAs) in XML data has recently been proposed as a meaningful way to identify interesting data nodes in XML data where their s...
Chong Sun, Chee Yong Chan, Amit K. Goenka