Searching for and extracting meaningful information from highly heterogeneous datasets is a hot topic that has received a lot of attention. However, the existing solutions are based on e...
There has been a large amount of research on efficient document retrieval in both IR and web search areas. One important technique to improve retrieval efficiency is early termina...
We introduce the notion of "non-malleable codes", which relaxes the notion of error-correction and error-detection. Informally, a code is non-malleable if the message cont...
Stefan Dziembowski, Krzysztof Pietrzak, Daniel Wic...
Researchers increasingly use electronic communication data to construct and study large social networks, effectively inferring unobserved ties (e.g. i is connected to j) from obs...
Munmun De Choudhury, Winter A. Mason, Jake M. Hofm...
Duplicate URLs cause serious problems across the whole pipeline of a search engine, from crawling and indexing to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...