Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
In this paper we present a methodology to extract information from the Web to build a taxonomy of terms and Web resources for a given domain. This taxonomy represents a hierarchy o...
This paper presents an architectural design and evaluation result of an efficient Web-crawling system. The design involves a fully distributed architecture, a URL allocating algor...
Understanding intents from search queries can improve a user’s search experience and boost a site’s advertising profits. Query tagging via statistical sequential labeling mode...
Ye-Yi Wang, Raphael Hoffmann, Xiao Li, Jakub Szyma...
Search engines, generally, return results without any regard for the concepts in which the user is interested. In this paper, we present our approach to personalizing search engin...