Duplicate URLs cause serious trouble throughout the search engine pipeline, from crawling and indexing to result serving. URL normalization transforms duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Tagging systems have become major infrastructures on the Web. They allow users to create tags that annotate and categorize content and share them with other users, very helpful in...
We describe a framework for automatically selecting a summary set of photos from a large collection of geo-referenced photographs. Such large collections are inherently difficult ...
Alexander Jaffe, Mor Naaman, Tamir Tassa, Marc Dav...
Background: Combining multiple evidence-types from different information sources has the potential to reveal new relationships in biological systems. The integrated information ca...
Artem Lysenko, Michael Defoin-Platel, Keywan Hassa...
As the academic world moves away from physical journals and proceedings towards online document repositories, the ability to efficiently locate work of interest among the torr...
Jayanthkumar Kannan, Beverly Yang, Scott Shenker, ...