This paper uses the URL word breaking task as an example to elaborate what we identify as crucial in designing statistical natural language processing (NLP) algorithms for Web scale ...
Kuansan Wang, Christopher Thrasher, Bo-June Paul H...
Accessing online information from various data sources has become a necessary part of our everyday life. Unfortunately, such information is not always trustworthy, as different sou...
An object on the Semantic Web is likely to be denoted with multiple URIs by different parties. Object coreference resolution is to identify “equivalent” URIs that denote the ...
This paper studies the problem of discovering and comparing geographical topics from GPS-associated documents. GPS-associated documents become popular with the pervasiveness of loc...
Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In...
Davide Canali, Marco Cova, Giovanni Vigna, Christo...