Although the Web lets users freely browse and publish information, most Web information, in contrast to conventional mass media, is unauthorized. Therefore, it is not always credibl...
In this paper we investigate how “self-awareness”, through on-line self-monitoring and measurement, coupled with intelligent adaptive behaviour in response to observe...
Web browser history detection using CSS visited styles has long been dismissed as an issue of marginal impact. However, due to recent changes in Web usage patterns, coupled with br...
In this paper, we study the problem of Web forum crawling. Web forums have now become an important data source for many Web applications, while forum crawling is still a challenging ...
Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...
This paper uses the URL word breaking task as an example to elaborate what we identify as crucial in designing statistical natural language processing (NLP) algorithms for Web scale ...
Kuansan Wang, Christopher Thrasher, Bo-June Paul H...