Many text documents on the Web are not originally created but forwarded or copied from other source documents. The phenomenon of document forwarding or transmission between variou...
Enterprise and web data processing and content aggregation systems often require extensive use of human-reviewed data (e.g. for training and monitoring machine learning-based appl...
Qi Su, Dmitry Pavlov, Jyh-Herng Chow, Wendell C. B...
Previous studies have highlighted the high arrival rate of new content on the web. We study the extent to which this new content can be efficiently discovered by a crawler. Our st...
Anirban Dasgupta, Arpita Ghosh, Ravi Kumar, Christ...
This paper explores the use of social annotations to improve web search. Nowadays, many services, e.g. del.icio.us, have been developed for web users to organize and share their f...
With the rapid growth of wireless technologies and handheld devices, m-commerce is becoming a promising research area. Personalization is especially important to the success of mc...
With the growth of social bookmarking a new approach for metadata creation called tagging has emerged. In this paper we evaluate the use of tag presentation techniques. The main g...
We present a content-driven reputation system for Wikipedia authors. In our system, authors gain reputation when the edits they perform to Wikipedia articles are preserved by subs...
Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some ...
In this paper we present an application framework that leverages geospatial content on the World Wide Web by enabling innovative modes of interaction and novel types of user inter...
We propose a new system to mine visual knowledge on the Web. There are huge image data as well as text data on the Web. However, mining image data from the Web is paid less attent...