We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Many information retrieval and machine learning methods have not evolved in order to be applied to the Web. Two main problems in applying some machine learning techniques for W...
Tags are an important information source in Web 2.0. They can be used to describe users’ topic preferences as well as the content of items to make personalized recommendations....
Information retrieval tools and search engines have mainly leveraged research results and technologies developed for the English language. In this paper, we report the issues...
This paper introduces a web image dataset created by NUS’s Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,...