Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
—Malicious software is rampant on the Internet and costs billions of dollars each year. Safe and thorough analysis of malware is key to protecting vulnerable systems and cleaning...
Anh M. Nguyen, Nabil Schear, HeeDong Jung, Apeksha...
This paper describes a method for hiding data inside printed text documents that is resilient to print/scan and photocopying operations. Using the principle of channel coding with...
In this paper, we present Structon, a novel approach that uses Web mining together with inference and IP traceroute to geolocate IP addresses with significantly better accuracy t...
Chuanxiong Guo, Yunxin Liu, Wenchao Shen, Helen J....
This paper introduces a web image dataset created by NUS’s Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,...