Previous anti-spamming algorithms based on link structure suffer from either the weakness of the page value metric or the vagueness of the seed selection. In this paper, we propos...
We present WBext (Web Browser extended), a web browser extended with client-side mining capabilities. WBext learns sophisticated user interests and browsing habits by tai...
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
In this paper we will briefly describe the approaches taken by the Berkeley Cheshire Group for the GikiCLEF task of the QA track. Because the task was intended to model some aspec...
Most of the information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...