In order to obtain a machine understandable semantics for web resources, research on the Semantic Web tries to annotate web resources with concepts and relations from explicitly d...
Weblog has quickly evolved into a new information and knowledge dissemination channel. Yet it is not easy to discover weblog communities through keyword search. The main contribut...
Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in the s...
It has become a promising direction to measure similarity of Web search queries by mining the increasing amount of clickthrough data logged by Web search engines, which record the...
Qiankun Zhao, Steven C. H. Hoi, Tie-Yan Liu, Soura...
Increasingly, manufacturing companies are shifting their focus from selling products to providing services. As a result, when designing new products, engineers must increasingly c...
This paper presents a simple and intuitive method for mining search engine query logs to get fast query recommendations on a large scale industrial-strength search engine. In orde...
XML is the predominant format for representing structured information inside documents, but it stops at the level of files. This makes it hard to use XML-oriented tools to process...
There are principal differences between the relational model and XML's tree model. This causes problems in all cases where information from these two worlds has to be brought...
Namespaces are a central building block of XML technologies today, they provide the identification mechanism for many XML-related vocabularies. Despite their ubiquity, there is no...
This paper presents a method for finding a specification page on the web for a given object (e.g., "Titanic") and its class label (e.g., "film"). A specificati...