Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Recently there has been increasing interest in the problem of transfer learning, in which the typical assumption that training and testing data are drawn from identical distributi...
Web usage mining plays an important role in the personalization of Web services, adaptation of Web sites, and the improvement of Web server performance. It applies data mining tec...
In this paper we explore private computation built on vector addition and its applications in privacypreserving data mining. Vector addition is a surprisingly general tool for imp...
Research shows that comment spamming (comments which are unsolicited, unrelated, abusive, hateful, commercial advertisements etc) in online discussion forums has become a common p...