RDF is the first W3C standard for enriching information resources of the Web with detailed meta data. The semantics of RDF data is defined using a RDF schema. The most expressiv...
Web crawler design presents many different challenges: architecture, strategies, performance and more. One of the most important research topics concerns improving the selection o...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
In this paper we perform a study of the image contents of the Chilean web (.cl domain) using automatic feature extraction, content-based analysis and face detection algorithms. In...
Alejandro Jaimes, Javier Ruiz-del-Solar, Rodrigo V...
Research on buying behavior indicates that buying guides perform an important role in the overall buying process. However, while many buying guides can be found on the Web, findin...