The amount of information available on the Web has increased rapidly, reaching levels that few would ever have imagined possible. We live in what could be called the "informa...
Online photo services such as Flickr and Zooomr allow users to share their photos with family, friends, and the online community at large. An important facet of these services is ...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
In this paper we describe preliminary work that examines whether statistical properties of the structure of websites can be an informative measure of their quality. We aim to deve...
Vaclav Petricek, Tobias Escher, Ingemar J. Cox, He...